halgir 1 day ago

I use it in much the same way as you, and it's been extremely beneficial. But I also would not dream of signing my name on something that has been independently produced by AI; it's just too often blatantly wrong on specifics.

I think people who do are simply not aware that AI is not deterministic the same way a calculator is. I would feel entirely safe signing my name on a mathematical result produced by a calculator (assuming I trusted my own input).

mrob 22 hours ago

LLMs are deterministic [0]. An LLM is a pure function that takes a list of tokens and returns a set of token probabilities. To make it "chat" you use the generated probabilities to pick a token, append that token to the list, and run the LLM again. Any randomness is introduced by the external component that picks a token using the probabilities: the sampler. Always picking the most likely token is a valid strategy.
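
For concreteness, here's a minimal sketch of that loop in Python, with a made-up toy_model standing in for the real network (the scores are invented; only the shape of the interface matters):

    import random

    # Toy stand-in for an LLM: a pure function from a token list to
    # next-token probabilities. A real model computes the scores with a
    # neural network, but the interface is the same.
    def toy_model(tokens):
        vocab = ["the", "cat", "sat", "<eos>"]
        scores = [1 + (sum(map(ord, "".join(tokens) + w)) % 7) for w in vocab]
        total = sum(scores)
        return dict(zip(vocab, (s / total for s in scores)))

    def greedy_sampler(probs):
        # always pick the most likely token -> the whole loop is deterministic
        return max(probs, key=probs.get)

    def random_sampler(probs):
        # sample according to the probabilities -> this is where randomness enters
        return random.choices(list(probs), weights=list(probs.values()))[0]

    def generate(prompt_tokens, sampler, max_tokens=10):
        tokens = list(prompt_tokens)
        for _ in range(max_tokens):
            next_token = sampler(toy_model(tokens))
            tokens.append(next_token)
            if next_token == "<eos>":
                break
        return tokens

    print(generate(["the"], greedy_sampler))  # identical on every run
    print(generate(["the"], random_sampler))  # may differ from run to run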

The problem is that all output is a "hallucination", and only some of it coincidentally matches the truth. There's no internal distinction between hallucination and truth.

[0] Theoretically; race conditions in a parallel implementation could add non-determinism.

ijk 21 hours ago

True, though in practice speed optimizations and instabilities on the GPU often make LLMs quite non-deterministic.
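
A tiny illustration of one common source of that (plain Python rather than GPU code, but the same effect): floating-point addition isn't associative, so a parallel reduction that accumulates terms in a different order each run can give slightly different results, which then cascade through the rest of the computation.

    # Floating-point addition is not associative, so summation order matters.
    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)  # 1.0
    print(a + (b + c))  # 0.0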

Which doesn't detract from your main point: there's not a lot of distinction between hallucinations and what we'd consider to be the "real thing." There have been various attempts to measure hallucinations, and we can figure out things like how confident the model is in a particular answer...but there's nothing grounding that answer. Saturate the dataset with the wrong answer and you'll get an overconfident wrong result.

jdlshore 21 hours ago

While this is technically correct, everyday use of LLMs involves a non-zero temperature, so they (the whole package that people think of as “AI”) are non-deterministic in practice.
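
For example, here's a rough sketch of what a non-zero temperature does to the sampler (the logits are made up; this is the generic temperature-softmax trick, not any particular vendor's implementation):

    import math, random

    def sample_with_temperature(logits, temperature):
        if temperature == 0:
            # degenerate case: always take the most likely token (deterministic)
            return max(logits, key=logits.get)
        # divide by temperature, then softmax; a higher temperature flattens
        # the distribution and makes unlikely tokens more likely to be picked
        scaled = {tok: score / temperature for tok, score in logits.items()}
        m = max(scaled.values())
        weights = {tok: math.exp(s - m) for tok, s in scaled.items()}
        total = sum(weights.values())
        probs = {tok: w / total for tok, w in weights.items()}
        return random.choices(list(probs), weights=list(probs.values()))[0]

    logits = {"Paris": 9.1, "Lyon": 7.3, "Berlin": 4.2}  # invented scores
    print(sample_with_temperature(logits, 0))    # always "Paris"
    print(sample_with_temperature(logits, 0.8))  # usually "Paris", occasionally not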

koakuma-chan 21 hours ago

No, hallucinations occur when the LLM is missing information.

swores 19 hours ago

That's not correct, and it seems to be based on a common misunderstanding of how LLMs work: the rough idea that when the info the model is asked for was in the training data, it "looks it up", not unlike software querying a huge database of general knowledge, and that when that lookup fails it falls back to making stuff up. But that's wrong: the model is doing exactly the same thing when it's hallucinating as when it's correct, just the result is different.

Hallucinations happen when the string of tokens the model judges most likely turns out to contain incorrect information. That's true regardless of whether the correct information is "missing" from the training data, or whether the correct answer would actually have come out had the model, when selecting the first token of the response, picked the option it considers second best rather than best.

Whether or not a piece of information was in the training set can obviously influence the likelihood of a model hallucinating when asked about the subject, but it can easily hallucinate about stuff that was in the training data, and it can also get things right that weren't.

koakuma-chan 18 hours ago

If an LLM happens to know the answer to your question, that answer will have the greatest weight, and will therefore become a non-hallucinated output. Otherwise the output will be hallucinated. Note that a hallucination may manifest as an attempt to extrapolate, which may be successful. If you query an LLM knowing in advance that it doesn't know the answer, you are guaranteed to receive a hallucinated output.

Or at least this is how I interpret the term.

swores 12 hours ago

But that's not how they actually work.

> "If an LLM happens to know the answer to your question, that answer will have the greatest weight"

An LLM doesn’t “know” anything in the way you’re imagining. It doesn’t have stored facts or indexed knowledge to check against; it just has learned weights over token sequences, and it outputs whatever next token is assigned the highest probability given the prompt and prior context. That might happen to produce a correct answer (and people are obviously working hard to make the models produce right answers as often as possible), but it might just as easily produce a plausible-sounding but wrong one, even if the correct information was in the training data, because that information being there doesn't guarantee it will have the highest weighting, let alone the highest weighting in every context of previous tokens and at every temperature setting.

You’re right that hallucinations can sometimes look like “extrapolations” that happen to land correctly, but that’s incidental. It’s still doing the same token-by-token probability selection regardless of whether it ends up right or wrong.

Framing it around “missing knowledge” vs “existing knowledge” is misleading intuition. It’s better to think about it in terms of probability distributions over token sequences: the model’s training biases it toward correct sequences more often than incorrect ones, but there’s nothing fundamental in the architecture that guarantees that if the answer was present in training, it will always beat out wrong guesses.

p.s. It's late at night here and I'm about to go to bed, so apologies if I've not explained well in this comment - I gave it to ChatGPT hoping it could tidy things up for me and it just made a way more confusing version so I'm posting it as is :D Let me know if my explanation still isn't clear and I could try again, or answer any questions you have, tomorrow

koakuma-chan 10 hours ago

> An LLM doesn’t “know” anything in the way you’re imagining. It doesn’t have stored facts or indexed knowledge to check against

Neither does your brain and yet you do "know" something.

> but it might just as easily produce a plausible-sounding but wrong one, even if the correct information was in the training data

If the majority of information that was in the LLM's training data said 1 + 1 = 3, the LLM will tell you that 1 + 1 = 3, even if there was some information that said 1 + 1 = 2, and there's nothing wrong with that because the LLM is not supposed to fact-check.

> the model’s training biases it toward correct sequences more often than incorrect ones

No, the model's training biases it toward sequences that appear more frequently.

BoorishBears 6 hours ago

It's trivial to prove this is wrong: invert relationships it knows about and it fails to answer based on knowledge it previously demonstrated (even with loads of hints).

https://chatgpt.com/share/680dc86c-f0dc-800d-9f04-57ba2f126a...

https://chatgpt.com/share/680dc90b-de28-800d-92b6-f2ef824777...

Note how applying increasing pressure to answer was what caused the hallucination: hallucinations aren't tied to whether the model "knows" something.

Once the output tokens don't fall into the start of some variation of "I don't know", the model is going to answer regardless of what it knows.

heylook 12 hours ago

> If an LLM happens to know the answer to your question

You're missing the point. It doesn't "know" anything. The only thing it can "know" is the statistical relationships between tokens in its dataset. It doesn't "know" anything about the meaning of those tokens. It doesn't even "know" whether it "knows" anything or not. The best it can do is "Here's a recursively generated string of ASCII codes that are statistically likely to follow each other according to the data corpus."

It's Rashomon. It can point you in the right directions a lot of the time, but there's no getting around the fact that you have to double-check its answers with external sources.

> Or at least this is how I interpret the term.

That's not a very useful interpretation because it's not grounded in technical reality.

koakuma-chan 10 hours ago

> It doesn't "know" anything.

The word "know" is an abstraction I use in order to avoid going into technical details.

> That's not a very useful interpretation because it's not grounded in technical reality.

My interpretation aligns with what people generally mean by hallucination, and it's definitely more useful than saying that any output is hallucination.

swores 5 hours ago

The difference is: what people generally mean by hallucination is "LLM said something wrong as if it was right". And what you are adding to that in your previous comments is the concept of whether or not the LLM knows the right answer. Which it never does. That's where your interpretation and the general interpretation differ.

I'm afraid I don't personally see how to explain it more clearly, so I'll just say this instead: given that multiple people in this thread are telling you your understanding of how LLMs work isn't right, please consider that to at least be a possibility, and look into it further rather than digging deeper into your current beliefs.

BlueTemplar 19 hours ago

But then isn't it also technically true that any software that includes a pseudo-random number generator is deterministic? (Starting with the PRNG itself, like the sampler you mention?)
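
The seeded case is easy to demonstrate with plain Python (nothing LLM-specific, just an illustration of "deterministic given the seed"):

    import random

    # With a fixed seed, a PRNG-driven program reproduces its output exactly;
    # "non-deterministic in practice" usually just means nobody pinned the seed.
    random.seed(42)
    first = [random.random() for _ in range(3)]
    random.seed(42)
    second = [random.random() for _ in range(3)]
    print(first == second)  # True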

And while that might be important in some contexts, like debugging with either the exact same seed or different seeds, isn't this a case where it rather confuses the issue?