Seems like this is an aspect of their well-known overconfidence and the inability to self-reflect and recognize they have to ask for more details because their priors are too low. If you look at the output of reasoning models, it’s clear that the idea of asking for clarification very rarely occurs to them – when they’re confused, it’s just endless speculation about what the user might have meant.
This, of course, has certain implications as to the wisdom of the idea of “replacing human programmers”, given that one of the hard parts of the trade is trying to turn vague and often confused ideas into precise specifications by interacting with the stakeholders.
> inability to self-reflect
IMO the One Weird Trick for LLMs is recognizing that there's no real entity, and that users are being tricked into a suspended-disbelief story.
In most cases you're contributing text-lines for a User-character in a movie-script document, and the LLM algorithm is periodically triggered to autocomplete incomplete lines for a Chatbot character.
You can have an interview with a vampire DraculaBot, but that character can only "self-reflect" in the same shallow/fictional way that it can "thirst for blood" or "turn into a cloud of bats."
Not to mention that vampires don’t reflect. ;)
Haha, true... however unlike LLMs, folklore tells us they can count! (Obsessively.)
This is a tired semantic argument that does not bring any insight into the discussion. A token-predictor could still be trained to predict the tokens “I’m not sure what you mean because of points x, y, and z; could you elaborate?”
It's not a tired argument, and not just a semantic one: it's a foundational characteristic of LLMs.
> A token-predictor could still be trained to predict the tokens “I’m not sure what you mean because of points x, y, and z; could you elaborate?”
This is entirely true, and the key insight is right there in your sentence, but you don't seem to grasp it. “could still be trained”: you can train an LLM into doing whatever you want it to, but you have to train it specifically for that!
In the early days of LLMs we witnessed this impressive phenomenon where they exhibited emergent capabilities (I'm particularly thinking about LLMs being few-shot learners on stuff that wasn't in their training corpus). And these emergent capabilities legitimately raised the question of “how intelligent these things are, really”.
But for the past three years, the key lesson is that this kind of emergent effect is too small to be useful, and the focus has shifted towards creating purpose-built datasets (with tons of “artificial data”) to train the model to explicitly do the things we want it to do. And it works pretty well, as models' capabilities have kept improving at a fast pace (and in particular, I don't see why we couldn't overcome the problem highlighted by this paper with more synthetic data specifically designed for multi-turn conversation). But their progress is now strictly limited by their makers' own intelligence. You cannot just scrape the web, throw compute at the problem, and expect emergent intelligence to occur anymore. It's more “simulated intelligence” than “artificial intelligence”, really.
It's definitely a tired and semantic one because, as he said, it brings no insight and isn't even good at the analogy level. I can't have a conversation with Dracula, and Dracula can't make decisions that affect the real world, so LLMs already break key aspects and assumptions of the 'Document Simulator'.
Pre-trained LLMs will ask clarifying questions just fine. So I think this is just another consequence of post-training recipes.
> Dracula can't make decisions that affect the real world, so LLMs already break key aspects and assumptions of the 'Document Simulator'.
Nonsense, we are already surrounded by mindless algorithms (and their outputs) that "affect the real world", because many of us have full-time jobs ensuring it happens!
When someone uses a SimCity-esque program to generate a spreadsheet used for real-world bus schedules, does that "break key aspects and assumptions of a traffic simulator"? Does the downstream effect elevate it to a microcosm of tiny lives? Nope!
You’re talking past the point I was making.
My point about Dracula isn't just that he's fictional, but that he cannot make decisions that have unscripted consequences in the real world, nor can he engage in a novel, interactive conversation. Dracula, as a character, only "acts" or "speaks" as an author (or game designer, etc.) has already written or programmed him to. He has no independent capacity to assess a new situation and generate a novel response that affects anything beyond his fictional context. If I "talk" to Dracula in a game, the game developers have pre-scripted his possible responses. The text of Dracula is immutable.
An LLM, by contrast, performs fresh inference every time it’s prompted: it weighs competing continuations and selects one. That selection is a bona fide decision (a branch taken at run-time). The “document-simulator” picture collapses that distinction, treating a dynamic decision process as if it were a block of pre-written prose. It's just nonsensical.
Your SimCity example is open loop: the simulation runs, a human inspects the results, and then decides whether to publish new bus schedules. Nothing in the simulator is tasked with interrogating the human, updating its model of their intent, or steering the outcome. In production LLM systems the loop is often closed: the model (often with tool-wrapper code) directly drafts emails, modifies configs, triggers API calls, or at minimum interrogates the user (“What city are we talking about?”) before emitting an answer.
Your argument is tired and semantic because it fails at the most fundamental level: it's not even a good analogy.
> LLMs already break key aspects and assumptions of the 'Document Simulator'. [...] The “document-simulator” picture collapses that distinction, treating a dynamic decision process as if it were a block of pre-written prose. It's just nonsensical.
I feel you've erected a strawman with this "document simulator" phrase of yours, something you've arbitrarily defined as a strictly one-shot process for creating an immutable document. Yeah, it's boring and "nonsensical" because you made it that way.
In contrast, everybody else here has been busy talking about iterative systems which do permit interaction, because the document is grown via alternate passes of (A) new content from external systems or humans and (B) new content predicted by the LLM.
I’m not arbitrarily defining it as a one-shot process. I’m pointing out how strained your “movie-script” (your words, not mine) comparison is.
>You can have an interview with a vampire DraculaBot, but that character can only "self-reflect" in the same shallow/fictional way that it can "thirst for blood" or "turn into a cloud of bats."
The "shallow/fictional way" only exists because of the limited, immutable nature of real scripts. A 'script' that does not have either of these properties would not necessarily produce characters that only reflect in a shallow manner.
Text that’s generated on the fly, while interrogating the user, calling tools, and updating its own working context, isn’t anything like a screenplay whose pages are fixed in advance.
There's no strawman here. You've decided that an LLM is not something you want to attribute a 'real' entity to and this is your rationalization for that.
> I’m pointing out how strained your “movie-script” (your words, not mine) comparison is. [...] the limited, immutable nature of real scripts [...] a screenplay whose pages are fixed in advance.
You are confused and again attacking an idea nobody else has advanced.
Even in my very first comment starting the thread, I explicitly stated that the "movie-script" is mutable, with alternate phases of "contributing" and "autocompleted" content as it grows.
Seriously, what's so hard to understand? The behaviors you're claiming result from an LLM being analogous to a script are only properties of the kinds of scripts LLMs are not, and so the analogy has no leg to stand on.
This is not a hard concept to grasp. I know what you are claiming. It doesn't automatically make your argument sound.
To call something that does not have the properties of a script a script is odd in the first place, but to recognize that and still assume behaviors that only follow from the properties you admit aren't even present in your new 'script' is just bizarre.
I'm not confused. You are.
It means if you want something resembling a self-introspective theory of mind, you need to arrange the overall document to cohere with documents where such things are (or appear to be) happening.
This leads us to new questions: How can we characterize and identify real-world documents which fit? How can we determine what features may be significant, and which of those can be easily transplanted to our use-case?
There are a lot of words here, but it feels like you have never really used LLMs (apologies for the bluntness).
We see LLMs introspecting all the time[1].
>Notably, DeepSeek-AI et al. report that the average response length and downstream performance of DeepSeek-R1-Zero increases as training progresses. They further report an “aha moment” during training, which refers to the “emergence” of the model’s ability to reconsider its previously generated content. As we show in Section 3.2, this reconsideration behaviour is often indicated by the generation of phrases such as ‘wait, ...’ or ‘alternatively, ...’
You are just doubling down on protecting your argument.
I operate LLMs in many conversational modes where they do ask clarifying questions, probing questions, baseline-determining questions.
It takes at most one sentence in the prompt to get them to act this way.
> It takes at most one sentence in the prompt to get them to act this way.
What is this one sentence you are using?
I am struggling to elicit clarification behavior from LLMs.
Could you share your prompt to get it to ask clarifying questions? I'm wondering if it would work in custom instructions.
It is domain dependent, you really need to play with it. Tell it you are doing pair thinking and either get it to ask questions about things it doesn't understand, or get it to ask you questions to get you to think better. Project the AI into a vantage point in the latent space and then get it to behave in the way that you want it to.
You can ask it to use the Socratic method, but then it is probing you, not its own understanding. Now have it use the Socratic method on itself. You can tell it to have multiple simultaneous minds.
Play with deepseek in thinking and non-thinking mode, give it nebulous prompts and see if you can get it to ask for clarifications.
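For what it's worth, here's a minimal sketch of the kind of one-sentence instruction being described, using the OpenAI Python SDK. The instruction wording, the deliberately vague user message, and the model name are illustrative choices on my part, not a canonical recipe:

```python
# Illustrative sketch: a single system-prompt sentence that nudges the model
# toward asking clarifying questions instead of guessing.
from openai import OpenAI

client = OpenAI()

SYSTEM = (
    "Before answering, ask me clarifying questions whenever my request is "
    "ambiguous or missing details you would need to do the task well."
)

resp = client.chat.completions.create(
    model="gpt-4o",  # placeholder model choice
    messages=[
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": "Make the report better."},  # deliberately vague
    ],
)
print(resp.choices[0].message.content)
```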
It could be trained to say that, but it's not exactly clear how you would reinforce the absence of certain training data in order to emit that response accurately, rather than just based on embedding proximity.
Why does it seem so hard to make training data for this? You can cook up a few thousand training examples and do RLHF.
Yes, but all that does is locate "I don't know" near the cooked up data within the embeddings. This doesn't actually reflect an absence of data in the training.
Seems easy. Have a set of vague requests and train it to ask for clarification instead of guessing.
As I said, it's possible to train it to ask for clarification, but it's not clear how to reinforce that response in a way that correctly maps on to the absence of data rather than arbitrary embedding proximity. You can't explicitly train on every possible scenario where the AI should recognize its lack of knowledge.
If the solution were easy or obvious the problem would likely have already been solved no?
We've only had ChatGPT and the like for a few years. It took Ford longer to make automatic transmissions.
So it is hard? Not easy? I would agree with that position. I think the analogy with automatic transmissions misses though. Programming actual intelligence into a computer seems orders of magnitude more complex and difficult than building the gearbox for a car.
I'm saying it shouldn't be that hard, but it's just one of a long list of features that the people whose job it is to do are working on.
It is hard in the sense that it's an unsolved problem that emerges due to the way LLMs work. Perhaps some clever ML PhD will come up with a technique to solve it, but right now there's no clear solution.
How does it identify what's vague?
Many ways. 1) Hire some humans to label the data. 2) Let the user give you feedback. 3) Ask another LLM.
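As a rough sketch of option 3, you can use a second model as a vagueness judge to label candidate prompts. The judge model and rubric wording here are assumptions, just to show the shape of it:

```python
# Sketch: label candidate training prompts as VAGUE or CLEAR with a judge LLM.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = (
    "You label user requests for an instruction-tuning dataset. "
    "Answer with exactly one word, VAGUE or CLEAR, depending on whether the "
    "request contains enough detail to act on without follow-up questions."
)

def label_vagueness(request: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder judge model
        messages=[
            {"role": "system", "content": JUDGE_PROMPT},
            {"role": "user", "content": request},
        ],
    )
    return resp.choices[0].message.content.strip()

# Requests labelled VAGUE become training examples whose target response is a
# clarifying question rather than a guess.
print(label_vagueness("Fix the bug in my code."))                       # likely VAGUE
print(label_vagueness("Sort this list of ints ascending: [3, 1, 2]"))   # likely CLEAR
```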
How would an LLM “know” when it isn’t sure? Its baseline for truth is competent text; it doesn’t have a baseline for truth based on observed reality. That’s why it can be “tricked” into things like “Mr Bean is the president of the USA”.
It would "know" the same way it "knows" anything else: The probability of the sequence "I don't know" would be higher than the probability of any other sequence.
Exactly. It's easy to imagine a component in the net that the model is steered towards when nothing else has a high enough activation.
The answer is the same as how the messy bag of chemistry that is the human brain "knows" when it isn't sure:
Badly, and with great difficulty, so while it can just about be done, even then only kinda.
We really don’t understand the human brain well enough to have confidence that the mechanisms that cause people to respond with “I don’t know” are at all similar to the mechanisms which cause LLMs to give such responses. And there are quite a few prima facie reasons to think that they wouldn’t be the same.
The mechanics don't have to be similar, only analogous, in the morphology sense.
'Analogous in the morphology sense' is actually a more specific concept than 'similar'. But either way, we still don't know if they're analogous, or similar, or whatever term you prefer.
Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same really ought to put in the effort to write up a paper and get a Nobel prize or two.
Analogous in the morphology sense means having come up with an entirely distinct solution to a common problem. Insect and bird wings have little to do with each other except that both flap to create lift. It explicitly does not imply the solutions are similar in mechanism, although that can be, and often is, a result of convergent evolution, of course.
In particular, generally speaking (I'm not claiming that LLMs are a road to AGI, which is something I doubt), it's not a well-defensible philosophical position that the vertebrate brain (and remember that mammalian, bird and cephalopod brains are very different) is uniquely suited to produce what we call "intelligence".
> Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same
This is a strawman and not my position.
It was a characterization of the position of the post I was originally responding to, not your position.
I don’t think anyone in this discussion has claimed that brains are uniquely suited to producing intelligence. The point was just that we have no idea if there is any interesting correspondence between how LLMs work and how brains work, beyond superficial and obvious analogies.
Humans can just as easily be tricked. Something like 25% of the American Electorate believed Obama was the antichrist.
So saying LLMs have no "baseline for truth" doesn't really mean much one way or the other; they are much smarter and more accurate than 99% of humans.
I agree that it's a tired argument, but there appear to be two separate things being discussed in this little corner of HN: clarity in the problem it's being asked to solve, and confidence that the answer it has is correct.
I can trivially get any of the foundational models to ask me clarifying questions. I've never had one respond with 'I don't know'.
I've gotten lots of responses like "with the information you provided, I cannot answer that. Can you provide more information?"
Which IMO is the same as "idk"
Anthropic found that Claude will pretend that it used the "standard" way to do addition - add the digits, carry the 1, etc. - but the pattern of activations showed it using a completely different algorithm. So these things can role-play at introspecting - they come up with plausible post-hoc explanations for their output - but they are still just pretending, so they will get it wrong.
So you can teach a model to sometimes ask for clarification, but will it actually have insight into when it really needs it, or will it just interject for clarification more or less at random? These models have really awful insight into their own capabilities; ChatGPT, for example, insists to me that it can read braille, and then cheerfully generates a pure hallucination.
> Anthropic found that Claude will pretend that it used the "standard" way to do addition - add the digits, carry the 1, etc. - but the pattern of activations showed it using a completely different algorithm.
That doesn't mean much; humans sometimes do the same thing. I recall a fun story about a mathematician with synesthesia multiplying numbers by mixing the colours together. With a bit of training such a person could also pretend to be executing a normal algorithm for the purposes of passing tests.
Even then the human doesn't know how they execute the algorithm, or mix the colours together - our conscious self-reflective mind has limits as to how far into our neural network weights it can reach. Can get further with lots of meditation, but it is still definitionally limited (in information theory terms).
I disagree, it's a very insightful comment.
The problem is that any information about any internal processes used to generate a particular token is lost; the LLM is stateless, apart from the generated text. If you ask an LLM-character (which I agree should be held distinct from the LLM itself and exists at a different layer of abstraction) why it said something, the best it can do is a post-hoc guess. The "character", and any internal state we might wish it to have, only exists insofar as it can be derived anew from the text.
I certainly agree with the point about post-hoc justifications – but isn't it amazing that it's also something very familiar to humans who do that all the time and manage to lie to ourselves about it very convincingly?! The more you read about neuropsychology the more you're forced to assume a view where the conscious self, whatever it is, has only a very tenuous grasp of what is going on and how much it actually has control over things.
In any case, you don't need accurate understanding of how your mind works (hello humans, again!) to be able to converge on
INSUFFICIENT DATA FOR A MEANINGFUL ANSWER
when there's no other uniquely good local optimum in the search space. The inability of LLMs to ask for clarification was exactly the flaw we encountered when testing them on open-ended problems, stated somewhat ambiguously. This was in the context of paradoxical situations, tested on DeepSeek-R1 and Claude-3.7-Sonnet. Blog post about our experiments: https://pankajpansari.github.io/posts/paradoxes/
> Seems like this is an aspect of their well-known overconfidence and the inability to self-reflect and recognize they have to ask for more details because their priors are too low.
When I read this I feel like I'm witnessing intelligent people get fooled by a better Emacs doctor. It is not reflecting, it is not confident. It is "just" proposing text completions. That is why once the completion starts being bad you have to start anew. It does not have any concept of anything, just a huge blob of words and the plausible follow-ups suggested by the texts used to train it.
Real programmers spend a ton of time just figuring out what people actually want. LLMs still treat guessing as a feature.
This cartoon needs an update for what an LLM came up with:
https://www.reddit.com/r/comics/comments/1l5tbc/update_to_th...
> This, of course, has certain implications as to the wisdom of the idea of “replacing human programmers”
Ironically, working with a junior dev is a lot like this -- setting them on a task, then coming back later with dogs and flashlights to retrieve them from the deep woods they've inevitably lost themselves in by just forging ahead, making assumptions, and asking no questions.
Isn’t this relatively trivial to correct? Just like chain-of-thought reasoning replaces end tokens with “hmm” to continue the thought, can’t users just replace the LLM’s tokens whenever it starts saying “maybe they are referring to” with something like “Let me ask a clarifying question before I proceed”?
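Something like the following toy sketch is what I have in mind: watch the draft for speculation phrases and, when one appears, truncate and force a clarification prefix instead. The trigger phrases and the `generate_continuation` stub are placeholders for whatever decoding loop you actually use, not a real API:

```python
# Toy sketch only: `generate_continuation` stands in for a real decoding loop
# (streaming API call, HF generate, etc.); it is not an actual library function.

SPECULATION_MARKERS = [
    "maybe they are referring to",
    "perhaps the user means",
    "they might be asking about",
]

CLARIFY_PREFIX = "Let me ask a clarifying question before I proceed: "

def generate_continuation(prompt: str) -> str:
    # Placeholder so the sketch runs; imagine the model's actual output here.
    if prompt.endswith(CLARIFY_PREFIX):
        return "which environment and which service do you mean?"
    return "Maybe they are referring to the staging environment, or possibly prod..."

def generate_with_clarification(prompt: str) -> str:
    draft = generate_continuation(prompt)
    lowered = draft.lower()
    for marker in SPECULATION_MARKERS:
        idx = lowered.find(marker)
        if idx != -1:
            # Cut off the speculation and re-decode from a prefix that commits
            # the model to asking instead of guessing.
            forced = draft[:idx] + CLARIFY_PREFIX
            return forced + generate_continuation(prompt + forced)
    return draft

print(generate_with_clarification("Deploy the thing we talked about yesterday."))
```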
Indeed, I was just about to edit my comment because the same occurred to me. Someone is probably going to try just that soon enough.
> inability to self-reflect and recognize they have to ask for more details because their priors are too low.
Gemini 2.5 Pro and ChatGPT-o3 have often asked me to provide additional details before doing a requested task. Gemini sometimes comes up with multiple options and requests my input before doing the task.
Gemini is also the first model I have seen call me out in its thinking. Stuff like "The user suggested we take approach ABC, but I don't think the user fully understands ABC; I will suggest XYZ as an alternative since it would be a better fit"
It is impressive when it finds subtle errors in complex reasoning.
But even the dumbest model will call you out if you ask it something like:
"Hey I'm going to fill up my petrol car with diesel to make it faster. What brand of diesel do you recommend?"
That's a recent development for (imho) higher engagement and reduced compute.
It's for higher quality of output. Better solutions. These are the state of the art reasoning models (subscription only, no free access) which are smarter.
It also mainly happens when the context is clear that we are collaborating on work that will require multiple iterations of review and feedback, like drafting chapters of a handbook.
I have seen ChatGPT ask questions immediately upfront when it relates to medical issues.
Close. Higher engagement means the user is more invested and values the solution more.
The users are being engineered more than the models are, and this isn't the only example.
Are you employed at Google or OpenAI? Are you working on these frontier models?
In the case of medical questions it needs to know further details to provide a relevant diagnosis. That is how it was trained.
In other cases you can observe its reasoning process to see why it would decide to request further details.
I have never seen an LLM just ask questions for the sake of asking. It is always relevant in the context. I don't use them casually. Just wrote a couple of handbooks (~100 pages in a few days). Generating tens of thousands of tokens per session with Gemini.
typical patterns to look out for:
- "Should I now give you the complete [result], fulfilling [all your demands]?"
- "Just say [go] and I will do it"
- "Do you want either [A, B, or C]"
- "In [5-15] minutes I will give you the complete result"
...
> "Do you want either [A, B, or C]"
That's an example of what I'm talking about. Watch the reasoning process produce multiple options. That's what it is trained to do. That is problem solving, not "engagement". It requires more compute, not less. You see that more with the expensive models.
> "In [5-15] minutes I will give you the complete result"
I haven't seen that before and I don't see how it's relevant.
> That's an example of what I'm talking about. Watch the reasoning process produce multiple options. That's what it is trained to do. That is problem solving, not "engagement". It requires more compute, not less. You see that more with the expensive models.
Fair point. Thanks for standing your ground and arguing so matter-of-factly with me! Appreciate it.
I have never been thanked for replying here before. Thanks.
The optional choices happen when it tries to reason out a solution, but then finds it is making too many assumptions of unknown details about the user's system, preferences, goals, and so on. It's just a thought pattern that it has learned to emulate.
People here will argue that LLM's cannot truly "think", but they are good enough at emulating thinking.
> and the inability to self-reflect and recognize they have to ask for more details
They're great at both tasks, you just have to ask them to do it.
You can certainly convince them to ask for details, but I'm not sure whether that makes them any good at knowing when exactly to ask vs just asking some percentage of the time regardless.
That is, does it actually know when it doesn't know, or are you just making it less confident overall, so it asks questions with no actual insight? Convincing a model to roleplay as someone who doesn't know things vs teaching a model to have insight into when it does and doesn't need clarification seems like a tough one.