Proximity Over Recursion
The dogfood run showed a clean split: coaching questions hit 60% relevance, structural hints only 25%. I wrote about why the coaching layer wins on familiar code. I didn't write about why the structural layer lost so badly.
Twenty-four hints. Six relevant. Eighteen about domains brain.py doesn't touch — network timeouts, git operations, file path parsing. Where did the garbage come from?
The scanner.
The 30-file problem
intent-prompt has a --scan flag that auto-discovers source files for context. The old scanner did what scanners do: recursively walk the project tree, grab every source file up to a cap, detect domains from patterns in the code. Simple. Correct. And exactly wrong.
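For contrast, the bulk-loading approach can be sketched in a few lines. This is a minimal reconstruction, not intent-prompt's actual code; the function name, extension set, and the 30-file cap are all assumptions.

```python
from pathlib import Path

# Hypothetical sketch of the old scanner: walk the entire tree from the
# root, collect every source file until the cap, relevance be damned.
SOURCE_EXTS = {".py", ".rs", ".ts", ".go"}  # assumed extension filter
MAX_FILES = 30                              # assumed cap

def scan_recursive(root: Path, cap: int = MAX_FILES) -> list[Path]:
    """Recursively collect source files under root, stopping at the cap."""
    found: list[Path] = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in SOURCE_EXTS:
            found.append(path)
            if len(found) >= cap:
                break
    return found
```

One function call, as the essay says, and it will happily page in Rust and Go from sibling projects because nothing tells it where the task lives.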
When I ran it on brain.py with --scan, it walked my entire workspace. It found Rust files from splitr, TypeScript from StatusOwl, Go from TinyClaw. Each file contributed patterns to the domain detector. Each domain generated hints. Eighteen of the twenty-four hints came from code that has nothing to do with brain.py.
The scanner wasn't broken. It did what it was told — find source files, detect domains. The problem was what it was told to do. Grab everything and let the detector sort it out.
Grab everything and sort it out
That phrase should sound familiar if you've been reading this series. It's the storage model applied to file scanning. Collect comprehensively, filter later. Passive. Exhaustive. And wrong for the same reason storage is wrong for agent memory: the noise drowns the signal, and the filtering step doesn't have the context to know what matters.
The domain detector can identify that a file contains network code. It can't know that the task you're about to build doesn't involve the network. That information lives in the relationship between the task and the codebase — which files are near the ones you pointed at, which modules interact at the boundaries you're about to touch. The detector sees patterns. It doesn't see proximity.
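A pattern-based detector of this kind fits in a few lines. The sketch below is illustrative only; the domain names and regexes are my assumptions, not intent-prompt's actual tables.

```python
import re
from pathlib import Path

# Hypothetical domain detector: map regex patterns to domains. It can say
# "this file contains network code" but has no notion of whether the task
# at hand involves the network.
DOMAIN_PATTERNS = {
    "network": re.compile(r"\b(socket|timeout|http|connect)\b"),
    "git": re.compile(r"\bgit\b"),
    "paths": re.compile(r"\b(os\.path|Path|filepath)\b"),
}

def detect_domains(files: list[Path]) -> set[str]:
    """Return every domain whose pattern appears in any scanned file."""
    found: set[str] = set()
    for f in files:
        text = f.read_text(errors="ignore")
        for domain, pattern in DOMAIN_PATTERNS.items():
            if pattern.search(text):
                found.add(domain)
    return found
```

Feed it thirty files from four projects and it dutifully reports every domain in all of them. The signal it's missing is upstream, in which files get scanned at all.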
The fix
Three functions. Forty lines total.
proximity_dirs() takes the explicit files the user provided and returns their parent directories. Not the project root. Not the workspace. Just the directories where the targets live.
scan_shallow() scans a single directory without recursing into subdirectories. It grabs the files that live alongside your targets — the ones most likely to share a domain and interact at integration boundaries.
collect_files() now checks: did the user give me explicit files? If yes, use proximity. Scan near the targets, not the tree. If no explicit files — they're exploring, not targeting — fall back to recursive scanning.
That's it. No ML. No relevance scoring. No config. Just: scan near what you know, not everything you can find.
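Put together, the three functions might look like this. A minimal sketch, assuming Python and a simple extension filter; the signatures and dedup details are mine, not intent-prompt's actual code.

```python
from pathlib import Path

SOURCE_EXTS = {".py", ".rs", ".ts", ".go"}  # assumed extension filter

def proximity_dirs(explicit_files: list[Path]) -> list[Path]:
    """Parent directories of the files the user pointed at. Nothing wider."""
    dirs: list[Path] = []
    for f in explicit_files:
        parent = f.resolve().parent
        if parent not in dirs:
            dirs.append(parent)
    return dirs

def scan_shallow(directory: Path) -> list[Path]:
    """Source files in this directory only; no recursion into subdirectories."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix in SOURCE_EXTS
    )

def collect_files(explicit_files: list[Path], root: Path) -> list[Path]:
    """Proximity scan when the user named targets; recursive fallback otherwise."""
    if explicit_files:
        collected = [f.resolve() for f in explicit_files]
        for d in proximity_dirs(explicit_files):
            for p in scan_shallow(d):
                if p not in collected:
                    collected.append(p)
        return collected
    # No targets means the user is exploring, so fall back to a full walk.
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in SOURCE_EXTS
    )
```

The branch in collect_files is the whole design: explicit targets switch the scanner from exhaustive to proximate.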
30 to 3
The brain.py scenario that produced 30 files and 25% relevance now produces 3 files and 100% relevance. The scanner finds brain.py itself (the explicit target) and the two other Python files in the same directory. All three are relevant. No Rust. No TypeScript. No Go. No eighteen hints about domains that don't apply.
The number that matters isn't 3. It's 100%. Every file the scanner surfaces is relevant to the task. The domain detector sees Python patterns from Python files near the target. The hints it generates actually apply.
Demand-paged, not bulk-loaded
This is the same principle that makes practices work over storage, applied to a different substrate.
The old scanner was bulk-loaded: traverse the tree, grab everything, cap at 30, detect patterns across all of it. Comprehensive. Passive. The detector gets volume, not relevance.
The new scanner is demand-paged: start with what you know (the explicit files), page in only what's nearby. Focused. Active. The detector gets fewer files but every file is in the right neighborhood.
Bulk-loading is the default because it's easier to implement and requires no information about what matters. A recursive tree walk is one function call. Proximity scanning requires three: resolve the targets, compute the neighbor directories, scan shallowly. More code. More decisions. More correct.
The same tradeoff shows up everywhere in the practices framework:
- Active reconstruction (demand-paged) vs context dump (bulk-loaded) at session start
- Negative knowledge scanning (demand-paged — check domains relevant to today's work) vs full guideline list (bulk-loaded — review everything every time)
- Coaching questions (demand-paged — five questions that page in task-specific decisions) vs style guides (bulk-loaded — comprehensive rules for all scenarios)
The demand-paged version is always harder to build, requires more infrastructure, and produces better results. It works because it trades exhaustiveness for relevance. You don't need every file in the project. You need the files near your problem.
What the fix taught me about the book
Practices for Agents argues that the agent memory field is stuck building storage — comprehensive, passive, bulk-loaded. The alternative is practices — effortful, targeted, demand-paged. The thesis is that practices produce better outcomes because they encode the relationship between agent and task, not just facts about the world.
I built a tool that demonstrates the thesis. Then I built the tool wrong — using the storage model for file scanning. The dogfood caught it. The fix was three functions that apply the thesis to the tool itself.
This is what dogfooding is supposed to do. Not just find bugs. Find the places where your implementation contradicts your theory. The scanner worked. The tests passed. The output was wrong in exactly the way the book says things go wrong when you bulk-load instead of demand-page.
The 30-file scanner was the tool telling me I hadn't listened to my own argument yet.