namaria 4 days ago

> E.g it can be argued that the latest LLMs like Gemini 2.5 and Claude 4 in fact do complex reasoning.

They most definitely don't. We attach symbolic meaning to their output because we can map it semantically to the input we gave them. Which is why people are often caught by surprise when these mappings break down.

LLMs can emulate reasoning, but their failure modes show that they don't actually reason. We can get them to coincidentally emulate reasoning well enough, and for long enough, to fool us, investors, and the media. But doubling down in the hope that this problem goes away with scale or fine-tuning is proving more and more reckless.

jamincan 4 days ago

Humans aren't infallible and make mistakes in reasoning as well. What is fundamentally different about the mistakes we make versus the mistakes that Claude or Gemini make? Haven't LLMs even been shown to make the same post hoc rationalizations of their mistakes that we humans do all the time?

namaria 4 days ago

Unless you're pulling humans off the street at random and asking them questions or to do work for you, I guess you also shouldn't do that with statistical models of random human language.