> Go ask the operator of a Chinese room to do some math they weren't taught in school, and see if the translation guide helps.
That analogy only holds if LLMs can solve novel problems that provably don't appear in any form in their training material.
They do. Spend some time using a modern reasoning model. There is a class of interesting problems, nestled between trivial ones whose answers can simply be regurgitated and difficult ones that either yield nonsense or involve tool use, that transformer networks can absolutely, incontrovertibly reason about.
Have any LLMs solved any of the big (or even lesser known) unanswered problems in math, physics, computer science?
It may appear that they are solving novel problems but given the size of their training set they have probably seen them. There are very few questions a person can come up with that haven't already been asked and answered somewhere.
Google's AlphaEvolve recently produced a novel matrix multiplication function slightly faster than the previous state of the art that couldn't have been in any training data. While not a hard unsolved problem, I think it's good evidence that an LLM is capable of synthesizing new solutions to problems.
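To make that concrete: these results are about shaving scalar multiplications off the naive algorithm. The textbook example is Strassen's 2x2 scheme, which gets by with 7 multiplications instead of 8; AlphaEvolve's result is of the same flavor at larger sizes. A minimal Go sketch of the classic Strassen step, shown only as the well-known baseline and not Google's new algorithm:

```go
package main

import "fmt"

// strassen2x2 multiplies two 2x2 matrices using Strassen's classic
// scheme: 7 scalar multiplications instead of the naive 8. This is
// the textbook construction, not AlphaEvolve's result.
func strassen2x2(a, b [2][2]float64) [2][2]float64 {
	m1 := (a[0][0] + a[1][1]) * (b[0][0] + b[1][1])
	m2 := (a[1][0] + a[1][1]) * b[0][0]
	m3 := a[0][0] * (b[0][1] - b[1][1])
	m4 := a[1][1] * (b[1][0] - b[0][0])
	m5 := (a[0][0] + a[0][1]) * b[1][1]
	m6 := (a[1][0] - a[0][0]) * (b[0][0] + b[0][1])
	m7 := (a[0][1] - a[1][1]) * (b[1][0] + b[1][1])

	return [2][2]float64{
		{m1 + m4 - m5 + m7, m3 + m5},
		{m2 + m4, m1 - m2 + m3 + m6},
	}
}

func main() {
	a := [2][2]float64{{1, 2}, {3, 4}}
	b := [2][2]float64{{5, 6}, {7, 8}}
	fmt.Println(strassen2x2(a, b)) // [[19 22] [43 50]]
}
```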
Reason about: sure. Independently solve novel ones without extreme amounts of guidance: I have yet to see it.
Granted, for most language and programming tasks, you don’t need the latter, only the former.
99.9% of humans will never solve a novel problem. It's a bad benchmark to use here.
But they will solve a problem novel to them, since they haven't read all of the text that exists.
I agree. But it's worth being somewhat skeptical of ASI scenarios if you can't, for example, give a well-formulated math problem to an LLM and have it solve it. Until we get a Riemann hypothesis calculator (or the equivalent for other hard/old unsolved maths), it's kind of silly to be debating the extreme ends of AI cognition theory.
"I'm taking this talking dog right back to the pound. It completely whiffed on both Riemann and Goldbach. And you should see the buffer overflows in the C++ code it wrote for me."
I have been able to get ChatGPT to synthesize at the edges of two domains in ideaspace, say psychology and economics, but surprisingly it struggled to help me write ODE code in Go. In the first case, I think it actually synthesized. In the latter it couldn't pull enough ideas from the two fields together into one.
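For a sense of scale of what "ODE code in Go" involves, here is a minimal hand-rolled sketch of a fixed-step RK4 integrator. The function names and the harmonic-oscillator example are illustrative assumptions, not the actual problem the model was asked about:

```go
package main

import "fmt"

// rk4Step advances the state y of the ODE dy/dt = f(t, y) by one
// step of size h using the classic fourth-order Runge-Kutta method.
func rk4Step(f func(t float64, y []float64) []float64, t float64, y []float64, h float64) []float64 {
	k1 := f(t, y)
	k2 := f(t+h/2, axpy(y, k1, h/2))
	k3 := f(t+h/2, axpy(y, k2, h/2))
	k4 := f(t+h, axpy(y, k3, h))

	out := make([]float64, len(y))
	for i := range y {
		out[i] = y[i] + h/6*(k1[i]+2*k2[i]+2*k3[i]+k4[i])
	}
	return out
}

// axpy returns y + s*k without modifying y.
func axpy(y, k []float64, s float64) []float64 {
	out := make([]float64, len(y))
	for i := range y {
		out[i] = y[i] + s*k[i]
	}
	return out
}

func main() {
	// Example system: simple harmonic oscillator x'' = -x,
	// written as the first-order system [x, v].
	f := func(t float64, y []float64) []float64 {
		return []float64{y[1], -y[0]}
	}

	y := []float64{1, 0} // x(0) = 1, v(0) = 0
	h := 0.01
	for t := 0.0; t < 3.14159; t += h {
		y = rk4Step(f, t, y, h)
	}
	fmt.Println(y) // x(pi) should be close to -1, v(pi) close to 0
}
```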
How can you distinguish "I think it did something really impressive in the first case but not the second" from "it spat out something that looked interesting in both cases, but in the latter case there was an objective criterion that exposed a lack of true understanding"?
It's famously easier to impress people with soft-sciences speculation than it is to impress the rules of math or compilers.
I think people give training data too much credit. Obviously it's important, but it also isn't a database of knowledge like it's made out to be.
You can see this in riddles that are obviously in the training set, but older or lighter models still get them wrong. Or situations where the model gets them right, but uses a different method than the ones used in the training set.