How would an LLM “know” when it isn’t sure? Its baseline for truth is competent text; it doesn’t have a baseline grounded in observed reality. That’s why it can be “tricked” into things like “Mr Bean is the president of the USA”.
It would "know" the same way it "knows" anything else: The probability of the sequence "I don't know" would be higher than the probability of any other sequence.
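As a minimal sketch of what that would mean in practice (the model name, prompt, and candidate answers here are just illustrative assumptions, using the Hugging Face transformers API): compare the log-probability the model assigns to “I don't know” against another continuation of the same prompt.

```python
# Sketch: score candidate continuations by summed log-probability.
# "gpt2" and the prompt/answers are placeholders, not a claim about any
# particular production model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sequence_logprob(prompt: str, continuation: str) -> float:
    """Sum of log-probabilities the model assigns to `continuation` after `prompt`."""
    prompt_ids = tokenizer(prompt, return_tensors="pt").input_ids
    full_ids = tokenizer(prompt + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Log-probabilities for each position predicting the *next* token.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = full_ids[:, 1:]
    token_log_probs = log_probs.gather(2, targets.unsqueeze(-1)).squeeze(-1)
    # Only count the tokens belonging to the continuation.
    cont_len = full_ids.shape[1] - prompt_ids.shape[1]
    return token_log_probs[0, -cont_len:].sum().item()

prompt = "Q: What is the capital of France?\nA:"
for answer in [" I don't know.", " Paris."]:
    print(answer, sequence_logprob(prompt, answer))
```

On this view, the model “knows” it doesn't know exactly when the first score comes out higher than every alternative.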
Exactly. It's easy to imagine a component in the net that the model is steered towards when nothing else has a high enough activation.
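A toy illustration of that idea (an assumption about decoding behaviour, not a claim about how any real model is wired): greedy decoding that falls back to a designated “don't know” token when no candidate clears a confidence threshold.

```python
# Toy sketch: if nothing activates strongly enough, emit the fallback token.
# The threshold value and the existence of a dedicated "don't know" token
# are assumptions for illustration only.
import torch

def pick_token(logits: torch.Tensor, dont_know_id: int, threshold: float = 0.5) -> int:
    """Greedy decoding with a fallback when no token is confident enough."""
    probs = torch.softmax(logits, dim=-1)
    top_prob, top_id = probs.max(dim=-1)
    if top_prob.item() < threshold:
        return dont_know_id  # nothing else had a high enough activation
    return int(top_id.item())

# Example with a toy 4-token vocabulary where token 3 stands for "I don't know".
logits = torch.tensor([0.2, 0.1, 0.3, 0.0])
print(pick_token(logits, dont_know_id=3))  # flat distribution -> falls back to 3
```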
The answer is the same as how the messy bag of chemistry that is the human brain "knows" when it isn't sure:
Badly, and with great difficulty, so while it can just about be done, even then only kinda.
We really don’t understand the human brain well enough to have confidence that the mechanisms that cause people to respond with “I don’t know” are at all similar to the mechanisms which cause LLMs to give such responses. And there are quite a few prima facie reasons to think that they wouldn’t be the same.
The mechanics don't have to be similar, only analogous, in the morphology sense.
'Analogous in the morphology sense' is actually a more specific concept than 'similar'. But either way, we still don't know if they're analogous, or similar, or whatever term you prefer.
Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same really ought to put in the effort to write up a paper and get a Nobel prize or two.
Analogous in the morphology sense means having come up with an entirely distinct solution to a common problem. Insect and bird wings have little to do with each other except that both flap to create lift. Analogy explicitly does not imply the solutions are similar in mechanism, although mechanistic similarity can, and often does, arise from convergent evolution, of course.
In particular, generally speaking (not claiming that LLMs are a road to AGI, which is something I doubt), it's not a well-defensible philosophical position that the vertebrate brain (and remember that mammalian, bird and cephalopod brains are very different) is uniquely suited to produce what we call "intelligence".
> Anyone who actually understands both LLMs and the human brain well enough to make confident claims that they basically work the same
This is a strawman and not my position.
It was a characterization of the position of the post I was originally responding to, not your position.
I don’t think anyone in this discussion has claimed that brains are uniquely suited to producing intelligence. The point was just that we have no idea if there is any interesting correspondence between how LLMs work and how brains work, beyond superficial and obvious analogies.
Humans can just as easily be tricked. Something like 25% of the American Electorate believed Obama was the antichrist.
So saying LLMs have no "baseline for truth" doesn't really mean much one way or the other; they are smarter and more accurate than 99% of humans.