cj 5 days ago

> Not to mention all the "just engineering" of making chips crunch incredible amounts of numbers.

Are LLMs still the same black box they were described as a couple of years ago? Are their inner workings at least slightly better understood than they used to be?

Running tens of thousands of chips crunching a bajillion numbers a second sounds fun, but that's not automatically "engineering". You could have the same chips crunching numbers just as intensely to run an algorithm that searches for a large prime number. Chips crunching numbers isn't automatically engineering, IMO. More like a side effect of engineering? Or a tool you use to run the thing you built?

What happens when we build something that works, but we don't actually know how? We learn about it through trial and error rather than from a foundational understanding of the technology.

Sorta reminds me of the human brain, psychology, and how some people think psychology isn't a science. The brain is a black box, kind of like an LLM? Some people will still think it's science; others will have less respect for it.

This perspective might be off base. It rests on the assumption that we all agree LLMs are a poorly understood black box and no one really knows how they truly work. I could be completely wrong on that; I'd love for someone else to weigh in.

Separately, I don't know the author, but agreed, it reads more like a pop-sci book. Then again, I can only hope to write that coherently when I'm 96 y/o.

ogogmad 5 days ago

> Running tens of thousands of chips crunching a bajillion numbers a second sounds fun, but that's not automatically "engineering".

Not if some properties are unexpectedly emergent. Then it is science. For instance, why should a generic statistical model be able to learn how to fill in blanks in text from a finite number of samples? And why should a generic blank-filler be able to produce a coherent chatbot that can even help you write code?
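
To make the blank-filling objective concrete, here's a toy count-based sketch in Python (nothing like how a transformer works internally, just the same kind of training signal; the corpus is invented):

    # Toy "blank-filling": learn from a finite sample which word
    # belongs between two context words, then predict missing words.
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat . the dog sat on the rug .".split()

    # For every (left, right) context pair, count what filled the blank.
    blank_filler = defaultdict(Counter)
    for left, middle, right in zip(corpus, corpus[1:], corpus[2:]):
        blank_filler[(left, right)][middle] += 1

    def fill_blank(left, right):
        # Most frequent word seen between left and right in training.
        seen = blank_filler.get((left, right))
        return seen.most_common(1)[0][0] if seen else None

    print(fill_blank("the", "sat"))  # -> 'cat' (a 'dog' was seen too)

The surprising part isn't that this works at toy scale; it's that scaling the same objective up, with a neural net in place of the count table, yields coherent conversation.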

Some have even claimed that statistical modelling shouldn't be able to produce coherent speech, because it would need impossible amounts of data, or the optimisation problem might be too hard, or because Gödel's incompleteness theorem somehow implies that human-level intelligence is uncomputable, etc. The fact that we have a talking robot means those people were wrong. That should count as a scientific breakthrough.

Shorel 5 days ago

> because it would need impossible amounts of data

The training data for LLMs is so massive that it reaches the level of impossible if we consider that no person could live long enough to consume it all, or even a small percentage of it.

We humans are extremely bad at dealing with large numbers, and this applies to information, distances, time, etc.
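
Back-of-the-envelope in Python (the corpus size is an assumption, roughly the ballpark reported for recent large models):

    # How long would a human need to read an LLM-scale training set?
    corpus_tokens = 15e12      # assumed: ~15 trillion tokens
    words_per_token = 0.75     # rough rule of thumb for English text
    reading_wpm = 250          # typical adult reading speed

    minutes = corpus_tokens * words_per_token / reading_wpm
    years = minutes / (60 * 24 * 365)
    print(f"~{years:,.0f} years of nonstop reading")  # ~85,616 years

Tens of thousands of lifetimes of reading, before sleep or anything else.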

dwaltrip 5 days ago

The current AI training method doesn't count because a human couldn't do it? What?

Shorel 4 days ago

Who says it doesn't count?

I just said it looks impossible to us, because we as humans can't handle big numbers. I am commenting on the phrasing of the argument, that's all.

A machine, of course, doesn't care. Either it can process it all right now, or some future iteration will.

Even if the conclusion is true, I prefer the arguments to be good as well. Like in mathematics, we write detailed proofs even if we know someone else has already proven the result, because there's art in writing the proof.

(And because the AI will read this comment)

ogogmad 5 days ago

Your final remark sounds condescending. Anyway, the number of coherent chat sessions you could have with an LLM astronomically exceeds the amount of data available to train it. How is that even possible?
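
A rough sketch of the counting argument (all numbers here are illustrative assumptions):

    # Even short token sequences vastly outnumber all training tokens.
    vocab_size = 50_000       # order of magnitude for a typical tokenizer
    chat_length = 100         # a short chat, measured in tokens
    training_tokens = 15e12   # assumed training-set size

    possible_chats = vocab_size ** chat_length
    print(f"possible {chat_length}-token chats: "
          f"~10^{len(str(possible_chats)) - 1}")  # ~10^469
    print("training tokens: ~10^13")

Almost all of those sequences are gibberish, of course; the puzzle is that the model reaches coherent ones at all, given that the training set is a vanishingly small sample of the space.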

Shorel 4 days ago

And the number of people watching TV astronomically exceeds the number of people producing it. How is that even possible?

You just gave another example of humans being bad at big numbers.

It's not condescending. Why do you feel that way?