Fair point Jeff -- you're right that we're still doing retrieval. The key distinction is how we retrieve.
Traditional RAG for code uses vector embeddings and similarity search. We use filesystem traversal and AST parsing - following imports, tracing dependencies, reading files in logical order. It's retrieval guided by code structure rather than semantic similarity.
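To make that concrete, here's a minimal sketch of what "following imports" can look like in Python -- illustrative only, not our actual implementation, and the module resolution is deliberately naive:

    # Illustrative sketch, not the real tool: traverse a codebase by
    # following imports instead of ranking chunks by embedding similarity.
    import ast
    from pathlib import Path

    def imported_modules(source: str) -> set[str]:
        """Top-level module names imported by a Python source file."""
        names = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                names.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                names.add(node.module.split(".")[0])
        return names

    def collect_context(entry: Path, root: Path, budget: int = 20) -> list[Path]:
        """Breadth-first walk over local imports: read the entry file,
        then the files it imports, then theirs, up to a file budget."""
        seen, queue, ordered = {entry}, [entry], []
        while queue and len(ordered) < budget:
            current = queue.pop(0)
            ordered.append(current)
            for name in imported_modules(current.read_text()):
                candidate = root / f"{name}.py"  # naive resolution: sibling modules only
                if candidate.exists() and candidate not in seen:
                    seen.add(candidate)
                    queue.append(candidate)
        return ordered

The point is just that the reading order comes from the code's own dependency structure, not from cosine similarity against the query.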
I highly recommend checking out what the Claude Code team discovered (48:00 https://youtu.be/zDmW5hJPsvQ?si=wdGyiBGqmo4YHjrn&t=2880). They initially experimented with RAG using embeddings but found that giving the agent filesystem tools to explore code naturally delivered significantly better results.
From our experience, vector similarity often retrieves fragments that mention the right keywords but miss the actual implementation logic. Following code structure retrieves the files a developer would actually need to understand the problem.
So yes -- I should have been clearer about the terminology. It's not "no retrieval" -- it's structured retrieval vs similarity-based retrieval. And with their massive context windows and sophisticated reasoning, today's frontier models are well suited to building understanding by exploring code the way a developer would, rather than relying on pre-digested embeddings.
Probably good to add a disclaimer at the top that clarifies the definition, since RAG is ultimately just a pattern, and vector indexes are just one way to implement it.
Indeed, industry at large sees RAG as equivalent to "vector indexes and cosine similarity w.r.t. input query", and the rest of the article explains thoroughly why that's not the right approach.
> industry at large sees RAG as equivalent to "vector indexes and cosine similarity w.r.t. input query"
Yep, and this is getting really old. Information retrieval is not a new problem domain. Somehow, when retrieved info is fed into an LLM, all nuance is lost and we end up with endless pronouncements about whether retrieval is/is not "dead".
Following dependencies is the way to go IMHO. Saying "Code Doesn't Think in Chunks" is, IMHO, not correct. Developers do think in chunks of code. E.g. this function calls that function, uses that type, and is used here and there. It is not really a file-based model like Cline uses. The file-based model is "just" simpler to implement :-). We use a more sophisticated code chunking approach in https://help.sap.com/docs/build_code/d0d8f5bfc3d640478854e6f... Let's see, maybe we should open source that ...
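To give a flavor of what I mean by symbol-level chunking, here is a simplified sketch (illustrative only, not our actual implementation):

    # Illustrative sketch, not the actual tool: a "chunk" is a function
    # plus the local functions it calls, rather than a fixed-size slice
    # of a file.
    import ast

    def call_graph(source: str) -> dict[str, set[str]]:
        """Map each top-level function to the local functions it calls."""
        tree = ast.parse(source)
        graph = {}
        for node in tree.body:
            if isinstance(node, ast.FunctionDef):
                graph[node.name] = {
                    n.func.id
                    for n in ast.walk(node)
                    if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
                }
        return graph

    def symbol_chunk(source: str, name: str) -> str:
        """Chunk for `name`: its definition plus its direct callees,
        regardless of where they sit in the file."""
        wanted = {name} | call_graph(source).get(name, set())
        tree = ast.parse(source)
        parts = [
            ast.get_source_segment(source, node)
            for node in tree.body
            if isinstance(node, ast.FunctionDef) and node.name in wanted
        ]
        return "\n\n".join(p for p in parts if p)

The chunk boundaries follow the symbol graph rather than file or token boundaries, which is the distinction I was getting at.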
Maybe you should indeed! Without looking into too much detail, how CLI-friendly would this tool be?
Hi nick, given that this product is open source, I have a request/wish:
It would be wonderful if some of the tools the project uses were exposed to build on, like the tools related to AST parsing, finding definitions, and many more.
If you're putting everything in the context window, is it still considered "retrieval"? Did we have a preexisting robust definition of what constitutes retrieval?
Don’t take this the wrong way, but did you use an LLM to generate this reply? The reply is good, but the writing style just piqued my curiosity.
This may not be what is meant here, but I wonder if in the future anybody who actually bothered to learn to write well will automatically be flagged as likely having used AI to write their prose.
For instance, everyone seems to believe that em dashes are something only an AI would use -- but I've been using them in my writing for a long time.
I wonder if the more we use LLMs the more our written patterns will begin to match them. Those of us who work with them the most are likely to be affected earliest.