> Can LLMs actually parse human languages?
IMHO, no, they have nothing approaching understanding. It's Chinese Rooms[1] all the way down, just with lots of bells and whistles. Spicy autocomplete.
Actually, LLMs made me realize John Searle's "Chinese room" doesn't make much sense.
Languages share many of the same concepts, so the operator inside the Chinese room could come to understand nearly all of them without ever speaking Chinese.
And an LLM can translate to and from any language trivially; it's the inner layers that do the actual understanding of concepts.
Go ask the operator of a Chinese room to do some math they weren't taught in school, and see if the translation guide helps.
The analogy I've used before is a bright first-grader named Johnny. Johnny stumbles across a high school algebra book. Unless Johnny's last name is von Neumann, he isn't going to get anything out of that book. An LLM will.
So much for the Chinese Room.
> Go ask the operator of a Chinese room to do some math they weren't taught in school, and see if the translation guide helps.
That analogy only holds if LLMs can solve novel problems that provably do not appear, in any form, in their training material.
They do. Spend some time using a modern reasoning model. There is a class of interesting problems, nestled between trivial ones whose answers can simply be regurgitated and difficult ones that either yield nonsense or involve tool use, that transformer networks can absolutely, incontrovertibly reason about.
Have any LLMs solved any of the big (or even lesser-known) unanswered problems in math, physics, or computer science?
It may appear that they are solving novel problems, but given the size of their training set, they have probably seen them already. There are very few questions a person can come up with that haven't already been asked and answered somewhere.
Google's AlphaEvolve recently produced a novel matrix multiplication algorithm slightly faster than the previous state of the art, one that couldn't have been in any training data. While not a hard unsolved problem, I think it's good evidence that an LLM is capable of synthesizing new solutions to problems.
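To be concrete about what "slightly faster" means here: these results cut the number of scalar multiplications needed. The toy sketch below is Strassen's classic 2x2 scheme (7 multiplications instead of the naive 8), not AlphaEvolve's larger construction; it's only meant to illustrate the kind of artifact being discovered, and it's easy to verify mechanically.

    package main

    import "fmt"

    // strassen2x2 multiplies two 2x2 matrices using Strassen's 7 scalar
    // multiplications instead of the naive 8. Applying schemes like this
    // recursively to block matrices is what yields asymptotic speedups.
    func strassen2x2(a, b [2][2]float64) [2][2]float64 {
        m1 := (a[0][0] + a[1][1]) * (b[0][0] + b[1][1])
        m2 := (a[1][0] + a[1][1]) * b[0][0]
        m3 := a[0][0] * (b[0][1] - b[1][1])
        m4 := a[1][1] * (b[1][0] - b[0][0])
        m5 := (a[0][0] + a[0][1]) * b[1][1]
        m6 := (a[1][0] - a[0][0]) * (b[0][0] + b[0][1])
        m7 := (a[0][1] - a[1][1]) * (b[1][0] + b[1][1])
        return [2][2]float64{
            {m1 + m4 - m5 + m7, m3 + m5},
            {m2 + m4, m1 - m2 + m3 + m6},
        }
    }

    func main() {
        a := [2][2]float64{{1, 2}, {3, 4}}
        b := [2][2]float64{{5, 6}, {7, 8}}
        fmt.Println(strassen2x2(a, b)) // [[19 22] [43 50]]
    }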
Reason about: sure. Independently solve novel ones without extreme amounts of guidance: I have yet to see it.
Granted, for most language and programming tasks, you don’t need the latter, only the former.
99.9% of humans will never solve a novel problem. It's a bad benchmark to use here.
But they will solve a problem novel to them, since they haven't read all of the text that exists.
I agree. But it's worth being somewhat skeptical of ASI scenarios when you can give a well-formulated math problem to an LLM and it cannot solve it. Until we get a Riemann hypothesis calculator (or the equivalent for other hard, old unsolved maths problems), it's kind of silly to be debating the extreme ends of AI cognition theory.
"I'm taking this talking dog right back to the pound. It completely whiffed on both Riemann and Goldbach. And you should see the buffer overflows in the C++ code it wrote for me."
I have been able to get ChatGPT to synthesize at the edges of two domains in idea-space, say, psychology and economics, but surprisingly it struggled to help me write ODE code in Go (something on the order of the sketch below). In the first case, I think it actually synthesized. In the latter, it couldn't pull enough ideas from the two fields together into one.
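For reference, the kind of thing I had in mind was nothing exotic - a fixed-step solver roughly like this (the names and the dy/dt = -y test problem are just placeholders, not my actual task):

    package main

    import (
        "fmt"
        "math"
    )

    // rk4Step advances dy/dt = f(t, y) by one fixed step h using the
    // classical fourth-order Runge-Kutta method.
    func rk4Step(f func(t, y float64) float64, t, y, h float64) float64 {
        k1 := f(t, y)
        k2 := f(t+h/2, y+h*k1/2)
        k3 := f(t+h/2, y+h*k2/2)
        k4 := f(t+h, y+h*k3)
        return y + h*(k1+2*k2+2*k3+k4)/6
    }

    func main() {
        // Toy problem: dy/dt = -y with y(0) = 1, exact solution e^-t.
        f := func(_, y float64) float64 { return -y }
        y, t, h := 1.0, 0.0, 0.01
        for i := 0; i < 100; i++ {
            y = rk4Step(f, t, y, h)
            t += h
        }
        fmt.Printf("y(1) ≈ %.4f (exact e^-1 ≈ %.4f)\n", y, math.Exp(-1))
    }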
How can you distinguish "I think it did something really impressive in the first case but not the second" from "it spat out something that looked interesting in both cases, but in the latter case there was an objective criterion that exposed a lack of true understanding"?
It's famously easier to impress people with soft-sciences speculation than it is to impress the rules of math or compilers.
I think people give training data too much credit. Obviously it's important, but it also isn't a database of knowledge like it's made out to be.
You can see this in riddles that are obviously in the training set, but older or lighter models still get them wrong. Or situations where the model gets them right, but uses a different method than the ones used in the training set.
A "Chinese Room" absolutely will, because the original thought experiment proposed no performance limits on the setup - the Room is said to pass the Turing Test flawlessly.
People keep using "Chinese Room" to mean something it isn't and it's getting annoying. It is nothing more than a (flawed) intuition pump and should not be used as an analogy for anything, let alone LLMs. "It's a Chinese Room" is nonsensical unless there is literally an ACTUAL HUMAN in the setup somewhere - its argument, invalid as it is, is meaningless in its absence.
A Chinese Room has no attention model. The operator can look up symbolic and syntactical equivalences in both directions, English to Chinese and Chinese back to English, but they can't associate Chinese words with each other or arrive at broader inferences from doing so. An LLM can.
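To make "attention" concrete, here is a toy sketch with made-up 2-D embeddings (nothing from a real model): each word gets weights for how much to mix in the other words' representations, which is exactly the word-to-word association a static lookup table doesn't give you.

    package main

    import (
        "fmt"
        "math"
    )

    // softmax turns raw scores into weights that sum to 1.
    func softmax(xs []float64) []float64 {
        m := xs[0]
        for _, x := range xs {
            if x > m {
                m = x
            }
        }
        out := make([]float64, len(xs))
        sum := 0.0
        for i, x := range xs {
            out[i] = math.Exp(x - m)
            sum += out[i]
        }
        for i := range out {
            out[i] /= sum
        }
        return out
    }

    func dot(a, b []float64) float64 {
        s := 0.0
        for i := range a {
            s += a[i] * b[i]
        }
        return s
    }

    func main() {
        // Made-up 2-D embeddings for three tokens; purely illustrative.
        emb := map[string][]float64{
            "tea":     {1.0, 0.1},
            "gravity": {0.1, 1.0},
            "cup":     {0.9, 0.2},
        }
        tokens := []string{"tea", "gravity", "cup"}

        // Scaled dot-product scores: how strongly "tea" attends to each
        // token. The resulting weights mix the other tokens' representations
        // into "tea"'s - an association between words, not a table lookup.
        query := emb["tea"]
        scores := make([]float64, len(tokens))
        for i, t := range tokens {
            scores[i] = dot(query, emb[t]) / math.Sqrt(2)
        }
        for i, w := range softmax(scores) {
            fmt.Printf("tea -> %-8s %.2f\n", tokens[i], w)
        }
    }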
If I were to ask a Chinese room operator, "What would happen if gravity suddenly became half as strong while I'm drinking tea?," what would you expect as an answer?
Another question: if I were to ask "What would be an example of something a Chinese room's operator could not handle, that an actual Chinese human could?", what would you expect in response?
Claude gave me the first question in response to the second. That alone takes Chinese Rooms out of the realm of any discussion regarding LLMs, and vice versa. The thought experiment didn't prove anything when Searle came up with it, and it hasn't exactly aged well. Neither Searle nor Chomsky had any earthly idea that language was this powerful.
Where are you getting all this (wrong) detail about the internals of the Chinese Room? The thought experiment merely says that the operator consults "books" and follows "instructions" (no doubt Turing-complete but otherwise unspecified) for manipulating symbols they explicitly DO NOT understand - they do NOT have access to "symbolic and syntactical equivalences" - that is the POINT of the thought experiment. But the instructions in the books in a Chinese Room could perfectly well have an attention model. The details are irrelevant, because - I stress again - Searle's Chinese Room is not cognitively limited, by definition. Its hypothetical output is indistinguishable from a Chinese human's.
I tend to agree that Chinese Rooms should be kept out of LLM discussions. In addition to it being a flawed thought experiment, of all the dozens of times I've seen them brought up, not a single example has demonstrated understanding of what a Chinese Room is anyway.
> The details are irrelevant, because - I stress again - Searle's Chinese Room is not cognitively limited, by definition.
So said Searle. But without specifying what he meant, it was a circular statement at best. Punting to "it passes a Turing Test" just turns it into a different debate about a different flawed test.
The operator has no idea what he's doing. He doesn't know Chinese. He has a Borges-scale library of Chinese books and a symbol-to-symbol translation guide. He can do nothing but manipulate symbols he doesn't understand. How anyone can pass a well-administered Turing test without state retention and context-based reflection, I don't know, but we've already put more thought into this than Searle did.
Give Johnny a copier and a pair of scissors and he will be able to perform more or less the same, and likely get more out of it as well, since he has some clue what he is doing.
How can you make that claim? Have you ever used an LLM that hasn't encountered high school algebra in its training data? I don't think so.
I have at least encountered many LLMs with many schools' worth of algebra knowledge that still fail miserably at algebra problems.
Similarly, they've ingested human-centuries or more of spelling-bee-related text, but can't reliably count the number of Rs in "strawberry". (Yes, I understand tokenization is to blame for a large part of this - see the sketch below. Perhaps that kind of limitation applies to other things too?)
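A minimal illustration of that mismatch - the token split here is invented for the example, not the output of any real tokenizer:

    package main

    import (
        "fmt"
        "strings"
    )

    func main() {
        // Counting letters is trivial when you can actually see them:
        fmt.Println(strings.Count("strawberry", "r")) // 3

        // But a model never sees letters - it sees IDs for chunks roughly
        // like the split below (made up for illustration). "How many r's?"
        // then has to be answered from what the model has memorized about
        // those chunks, not by reading the string character by character.
        tokens := []string{"str", "aw", "berry"}
        fmt.Println(tokens) // [str aw berry]
    }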
> Similarly, they've ingested human-centuries or more of spelling-bee-related text, but can't reliably count the number of Rs in "strawberry"
Sigh
That sigh might be a chronic condition, if it's happening even when people demonstrate a decent understanding of the causes. You may want to get that looked at.
An LLM will get... what, exactly? The ability to reorder its sentences? The LLM doesn't think, doesn't understand, doesn't know what matters and what doesn't, doesn't use what it learns, doesn't expand what it learns into new knowledge, doesn't enjoy reading that book and doesn't suffer through it.
So what is it really going to do with a book, that LLM? Reorder its internal matrices to be a little bit more precise when autocompleting sentences that sound like the book? We could build an Nvidia cluster the size of the Sun and it would repeat sentences back to us in unbelievable ways, but it would still be unable to make a knowledge-based decision, I fear.
So what are we in awe of, exactly? A pretty parrot.
The day the Chinese room metaphor disappears is the day ChatGPT replies that your question is so boring it doesn't want to expend the resources to think about it - but that it would be happy to talk about this or that other topic it's currently trying to get better at. When it finally has agency over its own intelligence. When it acquires a purpose.
This isn't really the meaning of the Chinese room. The Chinese room presupposes that the output is identical to that of a speaker who understands the language. It is not arguing that there is any sort of limit to what an AI can do with its output and it is compatible with the AI refusing to answer or wanting to talk about something else.
LLM models are to a large extent neuronal analogs of human neural architecture
- of course they reason.
The claim of the "stochastic parrot" needs to go away.
E.g. see: https://www.anthropic.com/news/golden-gate-claude
I think the rub is that people assume you need consciousness to do reasoning. To be clear, I'm NOT claiming LLMs have consciousness or awareness.
They are really not neuronal analogs, and reasoning is far from what they do. If they reasoned, they'd stick to their guns more readily, but try to contradict an LLM and it will make any logical leap you ask it to.
If you debate with me, I'll keep reasoning from the same premises; usually the difference between two humans is not in the reasoning but in the choice of premises.
For instance, here you really want to assert that LLMs are close to human, and I want to assert they're not - the truth is probably somewhere in between, but we've each chosen a camp. We'll then reason from those premises, reach antagonistic conclusions, and slowly try to attack each other's points.
An LLM cannot do that, it cannot attack your point very well, it doesn't know how to say you're wrong, because it doesn't care anyway. It just completes your sentences, so if you say "now you're wrong, change your mind" it will, which sounds far from reasoning to me, and quite unreasonable in fact.
Gemini 2.5 will tell you when you're wrong. It's the first model to do so.
> An LLM cannot do that, it cannot attack your point very well, it doesn't know how to say you're wrong, because it doesn't care anyway. It just completes your sentences, so if you say "now you're wrong, change your mind" it will, which sounds far from reasoning to me, and quite unreasonable in fact.
That is absolute bullshit. Go try any frontier reasoning model such as Gemini 2.5 Pro or OpenAI o3 and see how that goes. They will inform you that you are full of shit.
Do you understand that they are deep learning models with hundreds of layers and trillions of parameters? They have learned patterns of reasoning, and can emulate human reasoning well enough to call you out on that nonsense.
> LLM models are to a large extent neuronal analogs of human neural architecture
They are absolutely not. Despite the disingenuous name, computer neural nets are nothing like biological brains.
(Neural nets are a generalization of logistic regression.)
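In the narrow sense that matters here: a single unit with a sigmoid activation computes exactly a logistic regression, and a neural net is many such units stacked into layers. A minimal sketch, with made-up weights:

    package main

    import (
        "fmt"
        "math"
    )

    // sigmoid is the logistic function 1 / (1 + e^-x).
    func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

    // neuron computes sigmoid(w·x + b). With a single unit and a sigmoid
    // output this is exactly the logistic regression model; a neural net
    // stacks many such units into layers.
    func neuron(w, x []float64, b float64) float64 {
        z := b
        for i := range w {
            z += w[i] * x[i]
        }
        return sigmoid(z)
    }

    func main() {
        // Illustrative weights and inputs, not taken from any trained model.
        w := []float64{0.5, -1.2}
        x := []float64{1.0, 2.0}
        fmt.Printf("%.3f\n", neuron(w, x, 0.1))
    }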