Aider's repo-map concept is great! Thanks for sharing; I hadn't been aware of it. Using tree-sitter to give the LLM structural awareness is the right foundation IMO. The key difference is how that information gets to the model.
Aider builds a static map, with some importance ranking, and then stuffs the most relevant part into the context window upfront. That's smart, but the model is still receiving a fixed snapshot before it starts working.
What the RLM paper crystallized for me is that the agent could query the structure interactively as it works. A live index exposed through an API lets the agent decide what to look at, how deep to go, and when it has enough. When I watch it work it's not one or two lookups but many, each informed by what the previous revealed. The recursive exploration pattern is the core difference.
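To make the recursive exploration pattern concrete, here is a minimal sketch. The index below is a plain dict standing in for a tree-sitter-backed symbol table, and `explore` plays the role of the agent deciding how deep to go; the field names and API are invented for illustration, not taken from any real tool.

```python
# Hypothetical live index: symbol -> defining file and known callers.
# In a real system this would be built and kept fresh by tree-sitter.
INDEX = {
    "parse_config": {"file": "config.py", "callers": ["main", "reload"]},
    "main": {"file": "app.py", "callers": []},
    "reload": {"file": "daemon.py", "callers": ["main"]},
}

def explore(symbol, depth=2, seen=None):
    """Recursively walk callers; each lookup is informed by the last."""
    seen = seen if seen is not None else set()
    if symbol in seen or depth < 0 or symbol not in INDEX:
        return []
    seen.add(symbol)
    entry = INDEX[symbol]
    found = [(symbol, entry["file"])]
    for caller in entry["callers"]:
        found += explore(caller, depth - 1, seen)
    return found

print(explore("parse_config"))
# → [('parse_config', 'config.py'), ('main', 'app.py'), ('reload', 'daemon.py')]
```

The point is that the traversal order and stopping depth are decided at query time, per task, rather than baked into a snapshot built before the task starts.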
Aider actually prompts the model to say if it needs to see additional files. Whenever the model mentions file names, aider asks the user if they should be added to context.
Additionally, any files or symbols mentioned by the model are noted. They influence the repo-map ranking algorithm, so subsequent requests have even more relevant repository context.
This is designed as a sort of implicit search and ranking flow. The blog article doesn’t get into any of this detail, but much of this has been around and working well since 2023.
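The implicit ranking flow described above could be sketched roughly as follows. This is a toy illustration, not aider's actual algorithm: mention counts simply boost a file's base importance score on later requests, and all the names here are made up.

```python
from collections import Counter

# Accumulated across requests: how often the model mentioned each file.
mention_counts = Counter()

def note_mentions(reply_text, known_files):
    """Record files the model mentioned in its reply."""
    for f in known_files:
        if f in reply_text:
            mention_counts[f] += 1

def rank(base_scores):
    """Order files by static importance plus accumulated mention boosts."""
    return sorted(base_scores,
                  key=lambda f: base_scores[f] + mention_counts[f],
                  reverse=True)

files = ["utils.py", "db.py", "app.py"]
note_mentions("we should look at utils.py and db.py", files)
note_mentions("db.py handles the connection pool", files)
print(rank({"app.py": 3, "utils.py": 1, "db.py": 1}))
# → ['app.py', 'db.py', 'utils.py']
```

Note how `db.py`, mentioned twice, overtakes `utils.py` despite an equal base score, which captures the "subsequent requests have more relevant context" effect.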
I see, so the context adapts as the LLM interacts with the codebase across requests?
That's a clever implicit flow for ranking.
The difference in my approach is that exploration happens within a single task, autonomously. The agent traces through structure, symbols, implementations, and callers in many sequential lookups without human interaction. New files are automatically picked up via filesystem watching, but the core value is that the LLM can navigate the codebase the same way I might.
Are you using an LLM to help you write these replies, or are you just picking up their stylistic phrasings the way expressions go viral around an office until everyone is saying them?
As an LLM, you wouldn't consider that you're replying confidently and dismissively while clearly having no personal experience with the CLI coding agent that started it all. For a year (an eternity in this space) it was so far ahead of the upstarts (especially the VSCode-forks family) that it was like a secret weapon. In many ways it still is, thanks to its long lead and to being the carefully curated labor of a thoughtful mind.
As a dev seeking to improve on SOTA, having no awareness of the progenitor and the techniques one must do better than seems like a blind spot worth digging into before dismissing it. Aider's benchmarks on the practical applicability of model advancements vs. regressions in code editing observably drove both OpenAI and Anthropic to pay closer attention and improve SOTA for everyone.
Aider was onto something, and you are onto something, pushing forward 'semantic' understanding. It's worth absorbing everything Paul documented and blogged, and spending some time in Aider to get a feel for what Claude Code chose to do the same or differently, which ideas may be better, and what could be done next to go further.