How would an LLM “know” when it isn’t sure? Its baseline for truth is competent text; it doesn’t have a baseline grounded in observed reality. That’s why it can be “tricked” into things like “Mr Bean is the president of the USA”.
It would "know" the same way it "knows" anything else: The probability of the sequence "I don't know" would be higher than the probability of any other sequence.
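As a minimal sketch of what that would mean in practice (the model name, prompt, and candidate answers here are just illustrative assumptions, using the Hugging Face transformers API): compare the log-probability the model assigns to “I don't know” against another continuation of the same prompt.

```python
# Sketch: score candidate continuations by summed log-probability.
# "gpt2" and the prompt/answers are placeholders, not a claim about any
# particular production model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(prompt: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` after `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probabilities for each position predicting the *next* token.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_log_probs = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Only count the tokens belonging to the continuation.
    cont_len = full_ids.shape[1] - prompt_ids.shape[1]
    return token_log_probs[0, -cont_len:].sum().item()

prompt = "Q: What is the capital of France?\nA:"
for answer in [" I don't know.", " Paris."]:
    print(answer, sequence_logprob(prompt, answer))
```

On this view, the model “knows” it doesn't know exactly when the first score comes out higher than every alternative.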
Exactly. It's easy to imagine a component in the net that the model is steered towards when nothing else has a high enough activation.
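A toy illustration of that idea (an assumption about decoding behaviour, not a claim about how any real model is wired): greedy decoding that falls back to a designated “don't know” token when no candidate clears a confidence threshold.

```python
# Toy sketch: if nothing activates strongly enough, emit the fallback token.
# The threshold value and the existence of a dedicated "don't know" token
# are assumptions for illustration only.
import torch

def pick_token(logits: torch.Tensor, dont_know_id: int, threshold: float = 0.5) -> int:
    """Greedy decoding with a fallback when no token is confident enough."""
    probs = torch.softmax(logits, dim=-1)
    top_prob, top_id = probs.max(dim=-1)
    if top_prob.item() < threshold:
        return dont_know_id  # nothing else had a high enough activation
    return int(top_id.item())

# Example with a toy 4-token vocabulary where token 3 stands for "I don't know".
logits = torch.tensor([0.2, 0.1, 0.3, 0.0])
print(pick_token(logits, dont_know_id=3))  # flat distribution -> falls back to 3
```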
The answer is the same as how the messy bag of chemistry that is the human brain "knows" when it isn't sure:
Badly, and with great difficulty, so while it can just about be done, even then only kinda.
We really don’t understand the human brain well enough to have confidence that the mechanisms that cause people to respond with “I don’t know” are at all similar to the mechanisms which cause LLMs to give such responses. And there are quite a few prima facie reasons to think that they wouldn’t be the same.
The mechanics don't have to be similar, only analogous, in the morphology sense.
'Analogous in the morphology sense' is actually a more specific concept than 'similar'. But either way, we still don't know if they're analogous, or similar, or whatever term you prefer.
Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same really ought to put in the effort to write up a paper and get a Nobel prize or two.
Analogous in the morphology sense means having come up with an entirely distinct solution to a common problem. Insect and bird wings have little to do with each other except that both flap to create lift. Analogy explicitly does not imply the solutions are similar in mechanism, although mechanistic similarity can, and often does, arise from convergent evolution, of course.
In particular, generally speaking (not claiming that LLMs are a road to AGI, which is something I doubt), it's not a well-defensible philosophical position that the vertebrate brain (and remember that mammalian, bird and cephalopod brains are very different) is uniquely suited to produce what we call "intelligence".
> Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same
This is a strawman and not my position.
It was a characterization of the position of the post I was originally responding to, not your position.
I don’t think anyone in this discussion has claimed that brains are uniquely suited to producing intelligence. The point was just that we have no idea if there is any interesting correspondence between how LLMs work and how brains work, beyond superficial and obvious analogies.
Humans can just as easily be tricked. Something like 25% of the American Electorate believed Obama was the antichrist.
So saying LLMs have no "baseline for truth" doesn't really mean much one way or the other; they are smarter and more accurate than 99% of humans.