NitpickLawyer 7 days ago

Eh... I'm not convinced. I like Cline, I've been using it here and there, and I think it found a good middle ground between "vibesus take the wheel" and "hey, I'm still here and I enjoy doing this". I was particularly surprised that it worked pretty well with local models. A lot of that is on the model (tested w/ Devstral), but a lot of it is on the cradle (e.g. aider is great at what it does, but local model support is hit and miss).

First, as some other comments have mentioned, RAG is more than result = library.rag(). I get that a lot of people feel RAG is overhyped, but it's important to have the right mental model around it. It is a technique first. A pattern. Whenever you choose what to include in the context, you are performing RAG: Retrieve something from somewhere and put it in context. Cline seems to delegate this task to the model via agentic flows, and that's OK, but it's still RAG. The model chooses (via tool calls) what to Retrieve.
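To make that concrete, here's a minimal sketch of RAG-as-a-pattern (every name in it is made up for illustration): no vector database, no library call, just "retrieve something, put it in context". Even a naive keyword scorer counts.

    def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
        # Score documents by naive keyword overlap; return the top k.
        terms = set(query.lower().split())
        return sorted(
            documents,
            key=lambda doc: len(terms & set(doc.lower().split())),
            reverse=True,
        )[:k]

    def build_prompt(query: str, documents: list[str]) -> str:
        # "Put it in context": retrieved snippets go ahead of the question.
        context = "\n---\n".join(retrieve(query, documents))
        return f"Context:\n{context}\n\nQuestion: {query}"

    docs = [
        "auth.py: validate_token() checks JWT expiry",
        "billing.py: charge_card() wraps the payment client",
    ]
    print(build_prompt("fix that thing in auth", docs))

Swap the keyword scorer for embeddings or an agentic tool-call loop and the pattern is unchanged; only the Retrieve step differs.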

I'm also not convinced that embeddings can't be productive. I think Nick is right to point out some flaws in the current implementations, but that doesn't mean the concept itself is flawed. You can always improve the flows. I think there's a lot to gain from having embeddings, especially since they capture things that ASTs don't (comments, doc files, etc.).
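A sketch of what that buys you (assuming the sentence-transformers package is installed; the model name is just a common default, not anything Cline uses):

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    chunks = [
        "# HACK: tokens are refreshed lazily here",        # a code comment
        "README: run the seed script before first login",  # a doc file
        "def charge_card(amount): ...",                    # actual code
    ]

    query_emb = model.encode("why does the first login fail?", convert_to_tensor=True)
    chunk_embs = model.encode(chunks, convert_to_tensor=True)

    # Cosine similarity can surface the README line even though no
    # identifier in the query matches it -- exactly the stuff an AST
    # index never sees.
    scores = util.cos_sim(query_emb, chunk_embs)[0]
    print(chunks[int(scores.argmax())])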

Another aspect is overall efficiency. If you have somewhat repetitive tasks, you'll do this dance every time: hey, fix that thing in auth. Well, let's see where auth is. Read file1. Read file2. Read fileN. OK, the issue is in ... You can RAG this whole process once and re-use (some of) this computation. Or you can do "graphRAG": do this heavy lifting once per project and keep an AST + graph + model dump that can be RAG'd. There are a lot of cool things you can do.
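A hedged sketch of that "compute once, reuse" idea, stdlib only (the cache file name and index schema are invented for illustration):

    import ast
    import json
    from pathlib import Path

    INDEX = Path(".symbol_index.json")

    def build_index(root: str) -> dict[str, str]:
        # One pass over the project: map each top-level symbol to its file.
        index: dict[str, str] = {}
        for path in Path(root).rglob("*.py"):
            try:
                tree = ast.parse(path.read_text(), filename=str(path))
            except SyntaxError:
                continue
            for node in tree.body:
                if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                    index[node.name] = str(path)
        return index

    def lookup(symbol: str, root: str = ".") -> str | None:
        # Later queries reuse the cached computation instead of
        # re-reading file1..fileN; rebuild only if the cache is missing.
        if not INDEX.exists():
            INDEX.write_text(json.dumps(build_index(root)))
        return json.loads(INDEX.read_text()).get(symbol)

    # "Hey, fix that thing in auth" can now start at the edit point:
    print(lookup("validate_token"))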

In general, I don't think we know enough about the subject, best practices, and useful flows to confidently say "NO, never, nuh-huuh". I think there might be value there, and efficiencies to be gained, and some of them look like really low-hanging fruit. Why not take them?

avereveard 7 days ago

At some point they will move from scanning files to scanning the AST, and then token consumption will be greatly reduced by default. The challenge is that you then need something generic enough, like tree-sitter, to avoid the monumental effort of integrating a separate parser per language.

layer8 7 days ago

Why would an AST greatly reduce LLM token consumption?

avereveard 7 days ago

A lot of tokens are spent reading whole files just to figure out where the requested feature fits and where the edit point is. Access to an AST would let the LLM see the project "wireframe", so to speak, by asking at class or method level granularity, and only then retrieving the source for the symbol that most likely contains the edit point it needs. Some token consumption there is unavoidable, since the LLM needs the source to build a diff, but at least it's focused on the feature rather than the search.
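A rough sketch of that two-step flow using Python's stdlib ast module (tree-sitter would be how you generalize it across languages; the file name is hypothetical):

    import ast

    source = open("auth.py").read()  # hypothetical file
    tree = ast.parse(source)

    # Step 1: the wireframe -- names and line numbers only, no bodies.
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            print(type(node).__name__, node.name, "line", node.lineno)

    # Step 2: once a symbol is picked, fetch only that source. This is
    # the unavoidable part: the LLM needs it to build a diff.
    def source_of(symbol: str) -> str | None:
        for node in ast.walk(tree):
            if isinstance(node, (ast.ClassDef, ast.FunctionDef)) and node.name == symbol:
                return ast.get_source_segment(source, node)
        return None

    print(source_of("validate_token"))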

bicepjai 7 days ago

More focus on what to use, rather than the whole file where the code snippet sits.

layer8 7 days ago

I see, it could in principle more easily prune subtrees that aren't relevant. Initially I was assuming that the LLM would still ingest the whole AST in some form, since OP wrote "scanning the AST". Does that mean the LLM would be invoking some sort of tool to perform a query on the AST?
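Something like exposing AST queries as function tools, maybe? Purely a guess at the shape (OpenAI-style tool schema; the names are invented), so the full AST never has to enter the context:

    tools = [
        {
            "type": "function",
            "function": {
                "name": "list_symbols",
                "description": "Return the class/function skeleton of a file.",
                "parameters": {
                    "type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "get_symbol_source",
                "description": "Return the full source of one named symbol.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "path": {"type": "string"},
                        "symbol": {"type": "string"},
                    },
                    "required": ["path", "symbol"],
                },
            },
        },
    ]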