Just like with human engineers, you need to start with a planning session. This involves a back-and-forth discussion to hammer out the details before writing any code. I start off as vague as possible to see if the LLM recommends anything I hadn't thought of, then get more detailed as I go. When I'm satisfied, I have it create 2 documents, initialprompt.txt and TODO.md. The initial prompt file includes a summary of the project along with instructions to read the TODO file and mark each step as complete after finishing it.
This ensures the LLM has a complete understanding of the overall goals, along with a step by step list of tasks to get there. It also allows me to quickly get the LLM back up to speed when I need to start a new conversation due to context limits.
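For illustration, a minimal sketch of what such a pair of files could look like (the project and task list here are hypothetical, not taken from the comment above):

    # initialprompt.txt
    This project is a small CLI tool that converts CSV exports into JSON reports.
    Before writing any code, read TODO.md. Work through it one step at a time,
    and mark each step as complete immediately after finishing it.

    # TODO.md
    - [ ] 1. Set up the project skeleton and dependency manifest
    - [ ] 2. Implement CSV parsing, with unit tests
    - [ ] 3. Implement JSON report generation
    - [ ] 4. Wire up CLI argument handling
    - [ ] 5. Update the README with usage examples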
In essence, I need to schedule a meeting with the LLM and 'hammer out a game plan.' Gotta make sure we're 'in sync' and everybody's 'on the same page.'
Meeting-based programming. No wonder management loves it and thinks it should be the future.
LLMs are stealing the jobs of developers who go off half-cocked and spend three days writing 2000 lines of code implementing the wrong feature instead of attending a 30-minute meeting.
That's dumb, of course, but sometimes people really do just the bare minimum to describe what they want, and they can only think clearly once there's something in front of them. Those 2000 lines should be considered a POC, even at that size.
My manager has been experimenting with having AI first write the specs as architecture decision records (ADRs), then explain how it would implement them, then slowly actually implement them with lots of breaks, review, and approval/feedback. He says it's been far superior to typical agent coding, but not perfect.
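For context, ADRs usually follow a short standard shape; a bare-bones template (section names vary from team to team) looks roughly like this:

    # ADR-001: <decision title>
    Status: Proposed
    Context: the problem being solved and the constraints that apply
    Decision: the change being made, and why
    Consequences: what becomes easier or harder as a result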
> This ensures the LLM has a complete understanding of the overall goals
Forget about the overall goal. I have this simple instruction that I send on every request:
"stop after every failing unit test and discuss implementation with me before writing source code "
but it only does that about 7 times out of 10. Other times it just proceeds with the implementation anyway.
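For what it's worth, with agents that read a persistent instructions file (Claude Code's CLAUDE.md, for example), the same rule can live there instead of being pasted into every request; a minimal sketch, with no guarantee it gets followed any more reliably:

    # CLAUDE.md (project root)
    ## Workflow rules
    - Write the failing unit test first.
    - STOP after every failing unit test and discuss the implementation
      with me before writing any source code.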
I've found similar behaviour with stopping at linting errors. I wonder if my instructions are conflicting with the agent's system prompt.
System prompts themselves have many contradictions. I remember hearing an Anthropic engineer (possibly in Lex Fridman's interview with Amanda Askell) talking about using exaggerated language like "NEVER" just to steer Claude to rarely do something.
So it behaves just like a person.
That's why we replaced people with machines: so we can have some predictability.
Keyword: Some
Humans don't ignore an instruction 4 times out of 10 unless they're doing it on purpose.
I congratulate you on only working with humans who never misunderstand, never forget a step in a long process they think they know by heart, etc.
I guess you also think that we should get rid of checklists for pilots because they would never ignore an instruction they were clearly given during training except on purpose?
> I guess you also think that we should get rid of checklists for pilots because they would never ignore an instruction they were clearly given during training except on purpose?
Pilots ignore items in a checklist 4 times out of 10? wtf
Sadly this just doesn't pan out in larger, more complex projects. It will write an implementation plan, not follow it, then lie and say it did.
What tool and/or model are you calling "it"?
I'm using Claude Code on a large legacy monstrosity, and don't have this problem. There are problems and my flow automatically has it reviewing its own work in phased implementations, but even in the worst situations it's easy to get back on track.
> I have it create 2 documents, initialprompt.txt and TODO.md.
That is an amazing approach. Which (AI) tools are you using?