suddenlybananas 4 days ago

LLMs require vastly more data than humans and still struggle with some more esoteric grammatical rules like parasitic gaps. The fact that grammar can be approximated given trillions of words doesn't explain how babies learn language from a much more modest dataset.

numpad0 4 days ago

I think it does. I think LLMs showed us the possibility that maybe there is no "language" at all, just a pile of memes plus a supplemental compression scheme we call grammar.

LLMs have really destroyed Chomsky's positions in multiple different ways: nothing performs even close to an LLM in language generation, yet LLMs didn't grow a UG for natural languages, while they did develop a shared logic for non-natural languages and abstract concepts; the dataset has to be heavily English-biased for the model to be English-fluent, the parameter count has to be truly massive, on the order of multiple hundreds of billions of parameters, and so on.

Those are all circumstantial evidence at best, a random assortment of statements that aren't even appropriate to bring into the discussion, all meaningless - meaningless in the sense that an open hand, extended by a person observing another individual and aligned with the line from where that individual stands to the center of the nearest opening in a wall, would be meaningless.

suddenlybananas 4 days ago

>LLMs have really destroyed Chomsky's positions in multiple different ways: nothing performs even close to an LLM in language generation, yet LLMs didn't grow a UG for natural languages

Do you even understand Chomsky's position?

numpad0 3 days ago

To be honest, I don't, at least not entirely. Noam Chomsky, to me, is the patron saint of compilers and the apparent source of quotes used to justify eye-rolling decisions regarding i18n. At the very least, a lot of his followers' understanding is that UG is THE UG, a Universal Syntax, and/or a decisive, scientific refutation of the Sapir-Whorf hypothesis as well as of European structuralism, rather than whatever his later work on UG, which progressively pivoted its definition, or the nature-vs-nurture debates were "meant" to be discussing.

To me this text looks like his Baghdad Bob moment. Silly but right and noble. What else is it?

Ironically, these days you can just throw this text at ChatGPT and have it debloat or critique transcripts like this one. The results are worse than taking the time to read it yourself, but it gives you validation, if that is what is needed.

thomassmith65 4 days ago

It's not that the invention of LLMs conclusively disproves Chomsky's position.

However, we now have a proof-of-concept that a computer can learn grammar in a sophisticated way, from the ground up.

We have yet to code something procedural that approaches the same calibre via a hard-coded universal grammar.

That may not obliterate Chomsky's position, but it looks bad.

suddenlybananas 4 days ago

That's not the goal of generative linguistics though; it's not an engineering project.

thomassmith65 4 days ago

The problem encompasses not just biology and information technology, but also linguistics. Even if LLMs say nothing about biology, they do tell us something about the nature of language itself.

Again, that LLMs can learn to compose sophisticated texts from training alone does not close the case on Chomsky's position.

However, it is a piece of evidence against it. It does suggest, by Occam's razor, that a hardwired universal grammar is the lesser theory.

suddenlybananas 4 days ago

How do LLMs explain how 5 year olds respect island constraints?

thomassmith65 4 days ago

I don't have the domain knowledge to discuss that.

suddenlybananas 4 days ago

If you don't know what a syntactic island is, perhaps you're not the best judge of the plausibility of a linguistic theory.

thomassmith65 4 days ago

Fantastic, let's have a debate about me /s