thatjoeoverthr 2 days ago

There are a few problems with an „I don’t know” sample. For starters, what does it map to? Recall that the corpus consists of information we affirmatively have. You would need to invent a corpus of false stimuli. What you would have, then, is a model that writes „I don’t know” based on whether the stimulus better matches something real or one of the negatives.
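As a rough sketch of what inventing that negative corpus could look like (the questions, answers, and file name here are made up for illustration, not a documented recipe):

    import json, random

    # Toy mix of real QA pairs and invented "false stimuli" whose target
    # is the refusal itself. Everything here is illustrative.
    answerable = [
        {"q": "Who wrote Hamlet?", "a": "William Shakespeare."},
        {"q": "What is the boiling point of water at sea level?", "a": "100 °C."},
    ]
    unanswerable = [
        {"q": "Who wrote the sequel to Hamlet?", "a": "I don't know."},
        {"q": "What is the boiling point of kryptonite?", "a": "I don't know."},
    ]

    dataset = answerable + unanswerable
    random.shuffle(dataset)

    with open("sft_mix.jsonl", "w") as f:
        for row in dataset:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")

The model then learns to emit the refusal when the stimulus looks more like the second set than the first, which is exactly the failure mode described above.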

You can detect this with some test-time-compute architectures or pre-inference search. But that’s the broader application. This is a trick for the model alone.
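For what I mean by pre-inference search, a minimal sketch (the retriever, llm, and threshold are placeholders, not any particular library’s API):

    REFUSAL = "I don't know."

    def answer(query, retriever, llm, min_score=0.75):
        # Check the corpus before generating; refuse when nothing matches,
        # instead of trusting the model to notice on its own.
        hits = retriever.search(query, top_k=3)   # hypothetical retriever API
        if not hits or hits[0].score < min_score:
            return REFUSAL
        context = "\n".join(h.text for h in hits)
        return llm.generate(f"Context:\n{context}\n\nQuestion: {query}")  # hypothetical LLM API

Here the refusal comes from the surrounding system rather than the model’s own weights.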

dlivingston 2 days ago

The Chain of Thought in the reasoning models (o3, R1, ...) will actually express some self-doubt and backtrack on ideas. That tells me there's at least some capability for self-doubt in LLMs.

genewitch 1 day ago

That's not self-doubt; that's programmed in.

A poor man's "thinking" hack was to edit the AI's reply in the context, truncate it at the point where you wanted it to think, append a carriage return and "Wait...", then hit generate.

It was expensive because editing the context isn't free: you have to resend (and the model has to re-parse) the entire context.
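Roughly, the trick looked like this (complete is a stand-in for whatever raw text-completion backend you're running, e.g. llama.cpp or vLLM):

    def force_second_thought(prompt, first_reply, complete):
        # Keep the reply up to where you want the model to reconsider,
        # append "Wait...", and let it continue from there.
        truncated = first_reply.rstrip()
        # The whole edited context gets resent and re-parsed, which is
        # why this is expensive without prefix caching.
        new_prompt = f"{prompt}{truncated}\nWait..."
        return truncated + "\nWait..." + complete(new_prompt)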

This was injected into the thinking models, I hope programmatically.