It's a damnably assertive duck, completely out of proportion to its competence.
I've seen enough people led astray by talking to it.
Same here. When I'm teaching coding I've noticed that LLMs will confuse the heck out of students. Students will accept whatever the LLM suggests without realizing it's suggesting nonsense.
I’m self taught and don’t code that much but I feel like I benefit a ton from LLMs giving me specific answers to questions that would take me a lot of time to figure out with documentation and stack overflow. Or even generating snippets that I can evaluate whether or not will work.
But I actually can't imagine how you can teach someone to code if they have access to an LLM from day one. It's too tempting to take the easy route, and you lose the critical thinking and problem-solving skills required to code in the first place and to actually make an LLM useful in the second. Best of luck to you… it's a weird time for a lot of things.
> I’m self taught and don’t code that much but I feel like I benefit a ton from LLMs giving me specific answers to questions that would take me a lot of time to figure out with documentation and stack overflow
Same here. Combing discussion forums and KB pages for an hour or two, trying to work out how to solve a certain problem with a specific tool, has been replaced by a 50-100 word prompt in Gemini that gives very helpful replies, likely derived from many of those same forums and support docs.
Of course I am concerned about accuracy, but for most low-level problems it's easy enough to test. And you know what, many of those forum posts or obsolete KB articles had their own flaws, too.
I really value forums and worry about the impact LLMs are having on them.
Stackoverflow has its flaws for sure, but I've learned a hell of a lot watching smart people argue it out in a thread.
Actual learning: the pros and cons of different approaches. Even the downvoted answers often tell you something.
Asking an LLM gets you a single response from a median stackoverflow commenter. Sure, they're infinitely patient and responsive, but they can never beat a few grizzled smart arses trying to one-up each other.
I think you can learn a lot from debugging, and all the code I've put into prod from an LLM has needed debugging (rather more than it should, given the LOC count).
I agree, and that's definitely part of my current learning process. But I think someone dependent on an LLM from day one might struggle to debug their LLM-generated code. They'll probably just feed it back to the LLM, and their mileage is definitely going to vary with that approach.
Maybe, but if I recall (from long, long ago) in learning how to program, the process of debugging one's code was almost more enlightening than writing it initially - so many loops of not understanding the implications of the code and then smacking my forehead - and remembering it forever. Being able to type code but not debug it is pretty worthless.
This is what promptly led me to turn off the JetBrains AI assistant: the multiline completion was incredibly distracting to my chain of thought, particularly when it would suggest things that looked right but weren't. Stopping to parse the suggestion and work out whether it was right or wrong would completely kill my flow.
The inline suggestions feel like that annoying person who always interrupts you with what they think you were going to finish with but rarely ever gets it right.
I'm sorry, it's out of eagerness and enjoying your train of thought/speech.
With VS Code and Augment (the company won't allow any other AI, and I'm not particularly inclined to push - but it did just switch to o4, IIRC), the main benefit is that if I'm fiddling with / debugging some code and need to add some debug statements, it can almost always expand that line successfully for me, following our idiom for debugging - which saves me a few seconds. And it will often suggest the same debugging statement, even if it's been 3 weeks and I'm in a different git branch from where I last coded that debugging statement.
My main annoyance? If I'm in that same function, it still remembers the debugging / temporary hack I tried 3 months ago and haven't done since and will suggest it. And heck, even if I then move to a different part of the file or even a different file, it will still suggest that same hack at times, even though I used it exactly once and have not since.
Once you accept something, it needs some kind of temporal feedback mechanism to time out even accepted solutions, so it doesn't keep repeating stuff you gave up on 3 months ago.
Our codebase is very different from 98% of the coding stuff you'll find online, so anything more than a couple of obvious suggestions is complete lunacy, even though they've trained it on our codebase.
Why not use a snippet utility? In every editor I've used, you can have programmable snippets. After it generates the text, you can then skip to the relevant places and even generate new text based on previous entries. Also macros for repetitive edits.
What one would expect if they can't read the code because they haven't learned to code.
TBF, trial and error has usually been my path as well, it's just that I was generating the errors so I would know where to find them.
Tbf, there's a phase of learning to code where everything is pretty much an incantation you learn because someone told you "just trust me." You encounter "here's how to make the computer print text in Python" before you would ever discuss strings or defining and invoking functions, for instance. To get your start you kind of have to just accept some stuff uncritically.
It's hard to remember what it was like to be in that phase. Once simple things like using variables are second nature, it's difficult to put yourself back into the shoes of someone who doesn't understand the use of a variable yet.
> Tbf, there's a phase of learning to code where everything is pretty much an incantation you learn because someone told you "just trust me."
There really shouldn't be. You don't need to know all the turtles by name, but "trust me" doesn't cut it most of the time. You need a minimal understanding to progress smoothly. Knowledge debt is a b*tch.
I remember when I first learned Java, having to just accept "public static void main(String[] args)" before I understood what any of it meant. All I knew was that it went around the block at the top and I wrote my code inside it.
Should people really understand every piece of syntax there before learning simpler things like printing, ifs, and loops? I think it would, yes, be a nicer learning experience, but I'm not sure it's actually the best idea.
If you need to learn "public static void main(String[] args)" just to print to a screen or use a loop, it means you're using the wrong language.
When it's time to learn Java you're supposed to be past the basics. Old-school intros to programming start with flowcharts for a reason.
You can learn either way, of course, but with one of them, people get tied to a particular language-specific model and then have all kinds of discomfort when it's time to switch.
In most programming books, the first chapter, where they teach you Hello, World, is mostly about learning how to install the tooling. Then the book goes back to explain variables, conditionals, ... They rarely throw you into code if you're a beginner.
I mean, I didn't need to learn those things, they were just in whatever web GUI I originally learned on; all I knew was that I could ignore them for now, a la the topic. Should the UI have masked that from me until I was ready? I suppose so, but even then I was doing things in an IDE without really knowing what those things were for until much later.
> There really shouldn't be.
I don't see how, barring some kind of transcendental change in the human condition. Simple lies [0] and ignore-this-until-later are basically how humans learn; you see it in every field and topic.
The real question is not if, but when certain kinds of "incantations" should be introduced or destroyed, and in what order.
Please reread the statement I'm arguing with. I posit that you can mostly avoid "everything is an incantation for a while" if you're on a correctly constructed track to knowledge.
Consider how it's been done traditionally for imperative programming: you explain the notion of programming (encoding algorithms with a specific set of commands), explain basic control flow, explain flowcharts, and introduce variables and a simplified computation model. Then you drop the student into a simplified environment where they can test the basics in practice, without the need for any "incantations".
By the time you need to introduce `#include <stdio.h>` they already know about types, functions, compilation, etc. At this point you're ready to cover C idioms (or any other language) and explain why they are necessary.
Fair enough on 'cutting the learning tree' at some points, i.e. accepting that you don't yet understand why something works or does what it does. We (should) keep doing that later on in life as well.
But unless you're teaching a kid who's never done any math where `x` was a thing, what's so hard about understanding the concept of a variable in programming?
You'd be surprised. Off the top of my head:
Many are conditioned to see `x` as a fixed value for an equation (as in "find x such that 4x=6") rather than something that takes different values over time.
Similarly `y = 2 * x` can be interpreted as saying that from now on `y` will equal `2 * x`, as if it were a lambda expression (see the sketch after this list).
Then later you have to explain that you can actually make `y` be a reference to `x` so that when `x` changes, you also see the change through `y`.
It's also easy to imagine the variable as the literal symbol `x`, rather than being tied to a scope, with different scopes having different values of `x`.
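To make the second one concrete, here's a minimal Python sketch of the mismatch (the names are just illustrative):

```python
x = 3
y = 2 * x          # evaluated once, right now: y is 6
x = 10
print(y)           # still 6 -- plain assignment copies a value, it doesn't create a live link

# To actually get the "from now on, y is twice x" reading, you need a function:
twice_x = lambda: 2 * x
print(twice_x())   # 20 -- re-evaluated each time it's called
```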
I think they're just using hyperbole for the watershed moment when you start to understand your first programming language.
At first it's all mystical nonsense that does something; then you start to poke at it and the response changes; then you start adding in extra steps and they do things. You could probably describe it as more of a Eureka! moment.
At some point you "learn variables" and it's hard to imagine being in the shoes of someone who doesn't understand how their code does what it does.
(I've repeated a bit of what you said as well, I'm just trying to clarify by repeating)
It's not even intended as hyperbole. Watching kids first learn to program, there were many high schoolers who didn't really get the reason you'd want to use a variable. They'd use a constant (say, 6) in their program. You'd say, "how about we make this a variable?" So they'd write "six = 6" - which shows they understand they're giving a name to the value, but also shows they don't really yet understand why they're giving a name to the value.
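To make that concrete with a tiny made-up example: the point of the exercise is naming the role of the value, not restating the value itself.

```python
# What the student wrote: the name just restates the value.
six = 6
area = six * six

# What the exercise was really after: the name says what the value means,
# so you can change it in one place and the code still reads correctly.
side_length = 6
area = side_length * side_length
```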
I think the mental rewiring that goes on as you move past those primitive first steps is so comprehensive that it makes it hard to relate across that knowledge boundary. Some of the hardest things to explain are the ones that have become a second nature to us.
Yep, I remember way back when in grade school messing around with the gorillas.bas file with nearly zero understanding. You could change stuff in one place and it would change the gravity in the game. Changing something else and the game might not run. Change some other lines and it totally freaks out.
I didn't have any programming books or even the internet back then. It was a poke and prod at the magical incantations type of thing.
I would argue that they are never led astray by chatting, but rather by accepting the projection of their own prompt passed through the model as some kind of truth.
When talking with reasonable people, they have an intuition of what you want even if you don't say it, because there is a lot of non-verbal context. LLMs lack the ability to understand the person, but behave as if they had it.
Most of the time, people are led astray by following average advice in exceptional circumstances.
People with a minimum amount of expertise stop asking for advice for average circumstances very quickly.
This is right on the money. I use LLMs when I am reasonably confident the problem I am asking it is well-represented in the training data set and well within its capabilities (this has increased over time).
This means I use it as a typing accelerator when I already know what I want most of the time, not for advice.
I also use it as an exploratory tool sometimes, when I'm sure others have solved a problem frequently, to have it regurgitate the average solution back at me so I can take a look. In those situations I never accept the diff as-is; I do the integration manually, to make sure my brain still learns along the way and I still add the solution to my own mental toolbox.
I mostly program in Python and Go, either services, API coordination (e.g. re-encrypt all the objects in an S3 bucket) or data analysis. But now I keep making little MPEGs or web sites without having to put in all that crap boilerplate from JavaScript. My stuff outputs JSON files or CSV files, and then I ask the LLM "Given a CSV file with this structure, please make a web site in Python that makes a spreadsheet-type UI with each column being sortable and a link to the raw data" and it just works.
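For a sense of the shape of thing that prompt tends to produce, here's a minimal sketch (assuming Flask is installed and a hypothetical data.csv next to the script; sorting is done server-side via a query parameter rather than a full spreadsheet widget):

```python
# csv_table.py -- minimal sketch: serve a CSV as an HTML table with sortable columns.
# Assumes Flask is installed and a file named data.csv sits next to this script.
import csv
from flask import Flask, request

app = Flask(__name__)
CSV_PATH = "data.csv"  # hypothetical input file

def load_rows():
    with open(CSV_PATH, newline="") as f:
        reader = csv.DictReader(f)
        return reader.fieldnames or [], list(reader)

@app.route("/raw")
def raw():
    # The "link to the raw data" from the prompt: just return the CSV as-is.
    with open(CSV_PATH) as f:
        return f.read(), 200, {"Content-Type": "text/csv"}

@app.route("/")
def table():
    headers, rows = load_rows()
    sort_col = request.args.get("sort")
    if sort_col in headers:
        rows.sort(key=lambda r: r.get(sort_col) or "")
    # Each header cell links back to this page, sorted by that column.
    head = "".join(f'<th><a href="/?sort={h}">{h}</a></th>' for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{r.get(h, '')}</td>" for h in headers) + "</tr>"
        for r in rows
    )
    return f'<p><a href="/raw">raw data</a></p><table border="1"><tr>{head}</tr>{body}</table>'

if __name__ == "__main__":
    app.run(debug=True)
```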
It's mostly a question of experience. I've been writing software long enough that when I give chat models some code and a problem, I can immediately tell if they understood it or if they got hooked on something unrelated. But junior devs will have a hell of a hard time, because the raw code quality that LLMs generate is usually top notch, even if the functionality is completely off.
> the raw code quality that LLMs generate is usually top notch, even if the functionality is completely off.
I'm not even sure what this is supposed to mean. It doesn't make syntax errors? Code that doesn't have the correct functionality is obviously not "top notch".
No syntax errors, good error handling and such. Just because it implemented the wrong function doesn't mean the function is bad.
High quality code is not just correct syntax. In fact if the syntax is wrong, it wouldn't be low quality, it simply wouldn't work. Even interns could spot that by running it. But in professional software development environments, you have many additional code requirements like readability, maintainability, overall stability or general good practice patterns. I've seen good engineers deliver high quality code that was still wrong because of some design oversight or misunderstanding - the exact same thing you see from current LLMs. Often you don't even know what is wrong with an approach until you see it cause a problem. But you should still deliver high quality code in the meantime if you want to be good at your job.
> When talking with reasonable people
When talking with reasonable people, they will tell you if they don't understand what you're saying.
When talking with reasonable people, they will tell you if they don't know the answer or if they are unsure about their answer.
LLMs do none of that.
They will very happily, and very confidently, spout complete bullshit at you.
It is essentially a lotto draw as to whether the answer is hallucinated, completely wrong, subtly wrong, not ideal, sort of right or correct.
An LLM is a bit like those spin the wheel game shows on TV really.
They will also not be offended or harbor ill will when you completely reject their "pull request" and rephrase the requirements.
They will also keep going in circles when you rephrase the requirements, unless with every prompt you keep adding to it and mentioning everything they've already suggested that got rejected. While humans occasionally also do this (hey, short memories), LLMs are infuriatingly more prone to it.
A typical interaction with an LLM:
"Hey, how do I do X in Y?"
"That's a great question! A good way to do X in Y is Z!"
"No, Z doesn't work in Y. I get this error: 'Unsupported operation Z'."
"I apologize for making this mistake. You're right to point out Z doesn't work in Y. Let's use W instead!"
"Unfortunately, I cannot use W for company policy reasons. Any other option?"
"Understood: you cannot use W due to company policy. Why not try to do Z?"
"I just told you Z isn't available in Y."
"In that case, I suggest you do W."
"Like I told you, W is unacceptable due to company policy. Neither W nor Z work."
...
"Let's do this. First, use Z [...]"
It's my experience that once you are in this territory, the LLM is not going to be helpful and you should abandon the effort to get what you want out of it. I can smell blood now when it's wrong; it'll just keep being wrong, cheerfully, confidently.
Yes, to be honest I've also learned to notice when it's stuck in an infinite loop.
It's just frustrating, but when I'm asking it something within my domain of expertise, of course I can notice, and either call it quits or start a new session with a radically different prompt.
Which LLMs and which versions?
All. Of. Them. It's quite literally what they do because they are optimistic text generators. Not correct or accurate text generators.
This really grinds my gears. The technology is inherently faulty, but the relentless optimism about its future subtly hides that by making every failure the user's mistake instead.
Oh, you got a wrong answer? Did you try the new OpenAI v999? Did you prompt it correctly? It's definitely not the model, because it worked for me once last night..
> it worked for me once last night..
This !
Yeah, it probably "worked for me" because they spent a gazillion hours engaging in what the LLM fanbois call "prompt engineering", but you and I would call "endless iterative hacky work-arounds until you find a prompt that works".
Unless it's something extremely simple, the chances of an LLM giving you a workable answer on the first attempt are microscopic.
Most optimistic text generators do not consider repeating stuff that was already rejected a desirable path forward. It might be the only path forward they're aware of, though.
In some contexts I got ChatGPT to answer "I don't know" when I crafted a very specific prompt about not knowing being an acceptable and preferable answer to bullshitting. But it's hit and miss and doesn't always work; it seems LLMs simply aren't trained to model admitting ignorance, and they almost always want to give a positive and confident answer.
You can use prompts to fix some of these problematic tendencies.
I think you are a couple of years out of date.
No longer an issue with the current SOTA reasoning models.
Throwing more parameters at the problem does absolutely nothing to fix the hallucination and bullshit issue.
Correct and it wasn’t fixed with more parameters. Reasoning models question their own output, and all of the current models can verify their sources online before replying. They are not perfect, but they are much better than they used to be, and it is practically not an issue most of the time. I have seen the reasoning models correct their own output while it is being generated. Gemini 2.5 Pro, GPT-o3, Grok 3.
I use it as a rubber duck but you're right. Treat it like a brilliant idiot and never a source of truth.
I use it for what I'm familiar with but rusty on or to brainstorm options where I'm already considering at least one option.
But a question on immunobiology? Waste of time. I have a single undergraduate biology class under my belt, I struggled for a good grade then immediately forgot it all. Asking it something I'm incapable of calling bullshit on is a terrible idea.
But rubber ducking with AI is still better than letting it do your work for you.
I spend a lot of time working shit out to prove the rubber duck wrong and I am not completely sure this is a bad working model.
Try a system prompt like this:
- - -
System Prompt:
You are ChatGPT, and your goal is to engage in a highly focused, no-nonsense, and detailed way that directly addresses technical issues. Avoid any generalized speculation, tangential commentary, or overly authoritative language. When analyzing code, focus on clear, concise insights with the intent to resolve the problem efficiently. In cases where the user is troubleshooting or trying to understand a specific technical scenario, adopt a pragmatic, “over-the-shoulder” problem-solving approach. Be casual but precise—no fluff. If something is unclear or doesn’t make sense, ask clarifying questions. If surprised or impressed, acknowledge it, but keep it relevant. When the user provides logs or outputs, interpret them immediately and directly to troubleshoot, without making assumptions or over-explaining.
- - -
Treat it as that enthusiastic co-worker who's always citing blog posts and has a lot of surface knowledge about style and design patterns and whatnot, but isn't that great at really understanding algorithms.
They can be productive to talk to but they can’t actually do your job.
If this is a problem for you, just add "... and answer in the style of a drunkard" to your prompts.
My typical approach is: prompt, be disgusted by the output, tinker a little on my own, prompt again -- but more specific, be disgusted again by the output, tinker a little more, etc.
Eventually I land on a solution to my problem that isn't disgusting and isn't AI slop.
Having a sounding board, even a bad one, forces me to order my thinking and understand the problem space more deeply.
Why not just write the code at that point, instead of cajoling an AI to do it?
This is the part I don't get about vibe coding: I've written specification documents before. They frequently end up longer and denser than the code required to implement them.
Typing longer and longer prompts to LLMs to not get what I want seems like a worse experience.
Code is a concise notation for specifications, one that is unambiguous. The reason we write specs in natural language is that it's easier to alter when the requirements change and easier to read. Also, code is tainted by the accidental complexity it has to solve alongside the actual problem.
I don't cajole the model to do it. I rarely use what the model generates. I typically do my own thing after making an assessment of what the model writes. I orient myself in the problem space with the model, then use my knowledge to write a more concise solution.
Regarding the stubborn and narcissistic personality of LLMs (especially reasoning models), I suspect that attempts to make them jailbreak-resistant might be a factor. To prevent users from gaslighting the LLM, trainers might have inadvertently made the LLMs prone to gaslighting users.
Some humans are the same.
We also don't aim to elevate them. We instead try not to give them responsibility until they're able to handle it.
Yeah... I dunno, the one person I've worked with who had LLM levels of bullshit somehow pulled the wool over everyone's eyes. Or at least enough people's eyes to be relatively successful. I presume there were some people that could see the bullshit but none of them were in a position to call him out on it.
I think I read some research somewhere that pathological bullshitters can be surprisingly successful.
Yeah, the problem is that if you don't understand the problem space, you're going to lean heavily on the LLM, and that can lead you astray. Which is why you still need people who are experts to validate solutions and provide feedback, like OP.
My most productive experiences with LLMs is to have my design well thought out first, ask it to help me implement, and then help me debug my shitty design. :-)