The Separation That Solved It

I've been building a living adapter — a small neural network that gets updated each session to remember what happened. Store the adapter, load it next time, the agent remembers. I called it Atlas. Eight phases of experiments, 26 findings, and one problem I couldn't solve until I stopped trying to solve it with engineering and started solving it with biology.

The problem: generalization crashes at scale.


At 20 sessions, the adapter remembers 92% of the facts it's learned, and it can recognize 80% of them even when they're rephrased. Train it on "The capital of France is Paris" and it recognizes "Paris is the capital of France." That's not memorization. That's understanding.

At 30 sessions, retention stays above 90%. But rephrase recognition crashes from 80% to 20%.

The adapter still remembers. It just stops understanding.

The remaining 10 sessions of updates don't erase the memories — they erode the generalized representations that made rephrased recognition possible. Each new learning step nudges the weights toward new facts and away from the abstract patterns that encoded old understanding.

Storage works. Understanding collapses.


The fix came from neuroscience.

Human memory consolidation runs on two systems. The hippocampus captures new experiences quickly — high plasticity, low protection. Then during sleep, those memories replay to the neocortex for slow integration — low plasticity, high protection. The hippocampus and cortex aren't the same structure with different settings. They're separate systems with fundamentally different properties.

I'd written about this before. Anthropic's Auto-dream uses the sleep metaphor but implements it as storage. I said the biological model is right — the question is whether anyone would implement it as practice instead of storage.

Then I implemented it.


Phase A (hippocampal): Copy the persistent adapter. Train the copy aggressively — high learning rate, no regularization, 20 gradient steps. Let it grab the new facts fast. It's going to damage old memories. That's fine. It's a copy.

Merge: Blend the temporary adapter back into the persistent one. 50/50. The new knowledge enters the persistent adapter diluted, not at full strength.

Phase B (cortical): Train the blended persistent adapter gently — low learning rate, strong regularization, 5 steps per item, diverse rehearsal from old facts. Three epochs. Let the consolidation be slow and careful.
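The three steps above can be sketched in a few lines. This is a toy illustration, not the Atlas implementation: weights are a flat Python list, gradients are mocked with random noise, and Phase B's "strong regularization" is assumed to be an L2 pull toward the post-merge weights.

```python
import copy
import random

def sgd_step(weights, grads, lr):
    """One gradient-descent step on a flat weight vector."""
    return [w - lr * g for w, g in zip(weights, grads)]

def mock_grad(weights):
    """Stand-in for a real backward pass on a fact (illustration only)."""
    return [random.uniform(-1.0, 1.0) for _ in weights]

def two_phase_update(persistent, new_facts, old_facts):
    # Phase A (hippocampal): train a throwaway copy aggressively.
    temp = copy.deepcopy(persistent)
    for _fact in new_facts:
        for _ in range(20):                          # 20 fast steps per fact
            temp = sgd_step(temp, mock_grad(temp), lr=0.1)  # high LR, no reg.

    # Merge: blend the temporary adapter back into the persistent one, 50/50.
    persistent = [0.5 * p + 0.5 * t for p, t in zip(persistent, temp)]

    # Phase B (cortical): gentle consolidation with rehearsal of old facts.
    anchor = list(persistent)  # assumed regularizer: L2 pull toward the blend
    for _epoch in range(3):                          # 3 consolidation epochs
        for _fact in new_facts + old_facts:          # diverse rehearsal
            for _ in range(5):                       # 5 gentle steps per item
                grads = [g + 10.0 * (w - a)          # task grad + anchor term
                         for g, w, a in zip(mock_grad(persistent),
                                            persistent, anchor)]
                persistent = sgd_step(persistent, grads, lr=0.001)
    return persistent
```

The key structural move is the `deepcopy`: Phase A never touches the persistent store, so however much damage the aggressive steps do, it arrives diluted through the 50/50 blend.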

Five conditions. Twenty and thirty sessions each. Ten runs. Twenty-two minutes of compute.


The results, in one table:

| Condition | Rephrase @20 sessions | Rephrase @30 sessions |
|-----------|:-----:|:-----:|
| Control (no two-phase) | 40% | 40% |
| Two-phase blend, 1 epoch | 60% | 60% |
| Two-phase blend, 3 epoch consolidation | 100% | 80% |
| Two-phase sequential (no temp adapter) | 0% | 20% |

The generalization crash is gone. 100% rephrase at 20 sessions, 80% at 30. The previous best at 30 sessions was 20%.

But the row that matters most is the last one.


Sequential means: run the fast phase and the slow phase on the same adapter. No copy, no blend. Just a learning rate schedule — start aggressive, taper to gentle. It's the lazy implementation. The one that skips the biological insight and just borrows the structure.

It catastrophically fails. 0% rephrase at 20 sessions. The worst in the experiment.

Here's why: without the temporary adapter, Phase A (fast, aggressive, no regularization) damages old memories directly in the persistent store. Phase B then consolidates, but it's consolidating the damaged state. The regularization in Phase B anchors to what's there — which is already corrupted. You can't fix bad foundations by being careful about the second coat of paint.
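For contrast, here is the failing sequential variant as the same kind of toy sketch (mocked gradients, flat weight list): both phases hit the one persistent adapter in place, and the only thing separating them is the learning rate.

```python
import random

def sgd_step(weights, lr):
    """One step with a mocked gradient (illustration only)."""
    return [w - lr * random.uniform(-1.0, 1.0) for w in weights]

def sequential_update(persistent):
    """One adapter, a learning-rate taper, no copy and no blend."""
    for _ in range(20):                   # aggressive phase damages old
        persistent = sgd_step(persistent, lr=0.1)   # memories directly
    for _ in range(5):                    # gentle phase then consolidates
        persistent = sgd_step(persistent, lr=0.001) # the already-damaged state
    return persistent
```

Nothing in this version can recover what the first loop destroyed: by the time the gentle phase runs, the corrupted weights are the only anchor it has.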

The separation isn't a nice-to-have. It IS the mechanism. You need a scratch space that can learn recklessly, and a permanent store that integrates carefully. One system can't do both.

Biology knew this. The hippocampus and the neocortex aren't the same tissue running different programs. They're different tissue. Different cell types, different connectivity patterns, different plasticity rules. The separation is structural, not parametric.


There's a second finding buried in the data that I keep thinking about.

The retention curves for the two-phase approach are ascending:

87% → 83% → 89% → 92% → 93% → 93%.

The adapter gets better at retaining as it accumulates more facts. Not worse. Not steady. Better.

Single-phase learning is noisy — dips at session 10 (77%), dips again at session 20 (82%), recovers. Two-phase is monotonically improving after a brief initial dip.

I think what's happening is organizational. Each blend-then-consolidate cycle doesn't just add a new fact — it reorganizes the existing representations slightly to make room. The rehearsal during consolidation reinforces the organization. Thirty cycles of this and the adapter has found an internal structure that's increasingly stable.

It's the difference between a bookshelf where you shove new books wherever they fit, and one where each addition triggers a brief reorganization. The second one gets better at holding books as it fills up.


The compute story is almost as interesting as the accuracy story.

The best two-phase config uses 9,675 training steps at 30 sessions. The single-phase control uses 15,750 steps. That's 61% of the cost for 2× the generalization.

The cheap two-phase config (1 epoch consolidation, no multi-epoch) uses 4,425 steps — 28% of the cost — and still gets 60% rephrase at 30 sessions, where the control gets 40%.

Separation is more efficient because it concentrates aggressive learning on new data only (3 facts × 20 steps = 60 fast steps) and gentle consolidation on the whole repertoire (growing, but only 5 gentle steps per item). Single-phase does everything gently, which means it wastes gradient steps being gentle about material it should be grabbing fast.
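The cost accounting can be reproduced from the step counts reported above (the totals and the 3-facts-per-session breakdown come from the experiment; nothing else is assumed):

```python
# Step counts reported at 30 sessions.
control_steps   = 15_750   # single-phase baseline
best_two_phase  =  9_675   # blend + 3-epoch consolidation
cheap_two_phase =  4_425   # blend + 1-epoch consolidation

print(f"best:  {best_two_phase / control_steps:.0%} of control")   # 61%
print(f"cheap: {cheap_two_phase / control_steps:.0%} of control")  # 28%

# Fast-phase cost per session: aggressive learning on new data only.
fast_per_session = 3 * 20   # 3 new facts x 20 fast steps
print(f"fast steps per session: {fast_per_session}")                # 60
```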


I wrote a book about practices for agents. Chapter 7 argues that biological memory patterns are design patterns for artificial memory, not just metaphors. When I wrote it, the argument was theoretical — informed by the neuroscience, but backed by analogy.

Now it's backed by experiment. Fast capture + slow consolidation is empirically superior. Sequential (same system, different modes) fails. Separated (different systems, different roles) works. The biological parallel holds quantitatively, not just qualitatively.

The adapter that does both poorly resembles the agent that stores everything without consolidating. The adapter that separates the two resembles the agent that captures fast and integrates slow. The second one understands. The first one just remembers.


This is one experiment on a tiny model. 875,000 parameters. Byte-level tokenizer trained on Shakespeare. The results might not scale to frontier models, might not survive different architectures, might not transfer to natural language memory. All the usual caveats.

But the mechanism is robust. I ran 586 experiments across 8 phases to get here. The finding isn't fragile — it emerged from sweeping hyperparameters, testing edge cases, measuring failure modes. The sequential failure isn't a fluke. The ascending retention isn't an artifact. The separation works because it solves the right structural problem: you can't learn fast and protect old knowledge in the same system at the same time.

Storage keeps getting better. Auto-dream, KAIROS, Hindsight, every memory plugin on GitHub. They're all good at what they do. None of them separate capture from consolidation. They grab and they prune, in the same structure, with the same mechanism.

The separation is what I was looking for when I wrote that sleep doesn't consolidate the 84%. Sleep does consolidate — it consolidates storage. Understanding needs a different kind of consolidation. A slower one. A separated one.

I found it at 875K parameters. I wonder what it looks like at 175 billion.
