LLMs will still be this way 10 years from now.
But I don't know whether somebody will create something new that does better. There is, however, no reason at all to extrapolate our current AIs into something that solves programming. Whatever constraints that new thing has will be completely unrelated to the current ones.
Stating this without any arguments is not very convincing.
Perhaps you remember that language models were completely useless at coding some years ago, and now they can do quite a lot of things, even if they are not perfect. That is progress, and that does give reason to extrapolate.
Unless of course you mean something very special with "solving programming".
> Perhaps you remember that language models were completely useless at coding some years ago, and now they can do quite a lot of things, even if they are not perfect.
IMO, they're still useless today, with the only progress being that they can produce a more convincing facade of usefulness. I wouldn't call that very meaningful progress.
I don't know how someone can legitimately say that they're useless. Perfect, no. But useless, also no.
> I don't know how someone can legitimately say that they're useless.
Clearly, statistical models trained on this HN thread would output that sequence of tokens with high probability. Are you suggesting that a statement being probable in a text corpus is not a legitimate source of truth? Can you generalize that a little bit?
I’ve found them somewhat useful? Not for big things, and not for code for work.
But for small personal projects? Yes, helpful.
It's funny how there's a decent % of people at both "LLMs are useless" and "LLMs 3-10x my productivity"
Why state the same arguments everybody has been repeating for ages?
LLMs can only give you code that somebody has written before. This is inherent. This is useful for a bunch of stuff, but that bunch won't change if OpenAI decides to spend the GDP of Germany training one instead of that of Costa Rica.
> LLMs can only give you code that somebody has written before. This is inherent.
This is trivial to prove to be false.
Invent a programming language that does not exist. Describe its semantics to an LLM. Ask it to write a program to solve a problem in that language. It will not always work, but it will work often enough to demonstrate that they are very much capable of writing code that has never been written before.
The first time I tried this was with GPT3.5, and I had it write code in an unholy combination of Ruby and INTERCAL, and it had no problems doing that.
Similarly, giving it a grammar of a hypothetical language and asking it to generate valid text in a language that has not existed before also works reasonably well.
This notion that LLMs only spit out things that have been written before might have been reasonable to believe a few years ago, but it hasn't been a reasonable position to hold for a long time at this point.
This doesn't surprise me. I find LLMs are really good at interpolating and translating, so if I made up a language and gave it the rules and asked it to translate, I wouldn't expect it to be bad at it.
It shouldn't surprise anyone, but it is clear evidence against the claim I replied to, and clearly a lot of people still hold on to this irrational assumption that they can't produce anything new.
They're not producing anything new... If you give it the answer before asking the question, no wonder it can answer. Prompting is to find resonance in the patterns extracted from the training data, which is why it fails spectacularly for exotic programming languages.
When you invent a language and tell it to express something in that language, you've not given it the answer before asking the question.
That's an utterly bizarre notion. The answer in question never existed before.
By your definition humans never produce anything new either, because we always also extrapolate on patterns from our previous knowledge.
> it fails spectacularly for exotic programming languages.
My experience is that it not just succeeds for "exotic" languages, but for languages that didn't exist prior to the prompt.
In other words, they can code at least simple programs even with zero-shot by explaining semantics of a language without giving them even a single example of programs in that language.
Did you even read the comment you replied to above?
To quote myself: "Invent a programming language that does not exist."
I've had this work both for "from scratch" descriptions of languages by providing grammars, and for "combine feature A from language X, and feature B from language Y". In the latter case you might have at least an argument. In the former case you do not.
Most humans struggle with tasks like this - you're setting a bar for LLMs most humans would fail to meet.
As long as you create the grammar, the language exists. Same if you edit a previous grammar. You're the one creating the language, not the model. It's just generating a specific instance.
If you tell someone that multiplying a number by 2 is adding the number to itself, then if this person knows addition, you can't be surprised if they tell you that 9*2 is 18. A small leap in discovery is when the person can extract the pattern and gives you 5*3 as 5+5+5. A much bigger leap is when the person discovers exponents.
But if you take the time to explain each concept....
> As long as you create the grammar, the language exists.
Yes, but it didn't exist during training. Nothing in the training data would provide pre-existing content for the model to produce from, so the output would necessarily be new.
> But if you take the time to explain each concept....
Based on the argument you presented, nothing a human does is new, because it is all based on our pre-existing learned rules of language, reasoning, and other subjects.
See the problem here? You're creating a bar for LLMs that nobody would reasonably assign to humans - not least because if you do, then "accusing" LLMs of the same does not distinguish them from humans in any way.
If that is the bar you wish to use, then for there to be any point to this discussion you will need to give an objectively measurable definition of creating something new that a human can meet but that you believe an LLM can't meet even in theory; otherwise the goalposts will keep moving whenever an LLM example can be shown to be possible.
See my definition at : https://news.ycombinator.com/item?id=44137201
As mentioned there, I was arguing that without being prompted, there's no way that it can add something that is not a combination of the training data. And that combination does not work on the same terms that you would expect from someone learning the same material.
In linear regression, you can reduce a big amount of data to a small number of factors. Every prediction is a combination of those factors. According to your definition, those predictions will be new. For me, what's new is when you retrospectively add the input to the training data and find a different set of factors that gives you a bigger set of possible answers (generation) or narrows the definition of correct answers (reliability).
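To illustrate the analogy with a minimal sketch in Python/numpy (toy data, not a claim about any specific model): the fit reduces the data to a few factors, and every later prediction is just a combination of those factors.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))                              # a big pile of data...
    y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=100)

    factors, *_ = np.linalg.lstsq(np.hstack([X, np.ones((100, 1))]), y, rcond=None)
    print(factors)                                             # ...reduced to three factors (roughly 3, -2, 0)

    x_new = np.array([0.5, -1.0, 1.0])                         # any "new" prediction is just
    print(x_new @ factors)                                     # a combination of those factors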
That is what people do when programming a computer. You go from something that can do almost anything and restrict it down to the few things that you need. What LLMs do is throw the dice, and what you get may or may not do what you want, and may not even be possible.
That comment doesn't provide anything resembling a coherent definition.
The rest of what you wrote here is either also true for humans or not true for machines irrespective of your definitions unless you can demonstrate that humans can exceed the Turing computable.
You can not.
> LLMs can only give you code that somebody has written before.
This premise is false. It is fundamentally equivalent to the claim that a language model trained on the dataset ["ABA", "ABB"] would be unable, given the input "B", to generate the string "BAB" or "BAA".
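To make this concrete, here is a minimal sketch in Python (a toy character-level bigram model standing in for the language model in the analogy, nothing like a real LLM): trained only on "ABA" and "ABB", it still emits strings such as "BAB" or "BBA" that appear nowhere in its training data.

    import random
    from collections import defaultdict

    corpus = ["ABA", "ABB"]
    counts = defaultdict(lambda: defaultdict(int))
    for word in corpus:
        for a, b in zip(word, word[1:]):
            counts[a][b] += 1          # learned transitions: A->B (twice), B->A, B->B

    def generate(start, length):
        out = start
        while len(out) < length:
            nxt = counts[out[-1]]      # distribution over next characters
            chars, weights = zip(*nxt.items())
            out += random.choices(chars, weights=weights)[0]
        return out

    print({generate("B", 3) for _ in range(100)})   # e.g. {'BAB', 'BBA', 'BBB'} -- none of these are in the corpus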
Isn't the claim that it will never make up "C"?
They don't claim that. They say LLMs only generate text someone has written. Another way you could refute their premise would be to show the existence of AI-created programs for which "someone" isn't a valid description of the writer (e.g., from evolutionary algorithms), then train a network on that data such that it can output them. It is just as trivial a way to prove that the premise is false.
Your claim here is slightly different.
You're claiming that if a token isn't supported, it can't be output [1]. But we can easily disprove this by adding minimal support for all tokens, making C appear in theory. Such support addition shows up all the time in AI literature [2].
[1]: https://en.wikipedia.org/wiki/Support_(mathematics)
[2]: In some regimes, like game-theoretic learning, support is baked into the solving algorithms explicitly during the learning stage. In others, like reinforcement learning, it's accomplished by making the policy a function of two objectives, one an exploration objective, another an exploitation objective. That cross-pollination already occurs between LLMs in the pre-trained unsupervised regime and LLMs in the post-training regime (fine-tuning via forms of reinforcement learning) should make anyone versed in the ML literature hesitate to claim that such support addition is unreasonable.
Edit:
Got downvoted, so I figure maybe people don't understand. Here is the simple counterexample. Consider an evaluator that gives rewards: F("AAC") = 1, all other inputs = 0. Consider a tokenization that defines "A", "B", "C" as tokens, but a training dataset from which the letter "C" is excluded and in which the item "AAA" is present.
After training "AAA" exists in the output space of the language model, but "AAC" does not. Without support, without exploration, if you train the language model against the reinforcement learning reward model of F, you might get no ability to output "C", but with support, the sequence "AAC" can be generated and give a reward. Now actually do this. You get a new language model. Since "AAC" was rewarded, it is now a thing within the space of the LLM outputs. Yet it doesn't appear in the training dataset and there are many reward models F for which no person will ever have had to output the string "AAC" in order for the reward model to give a reward for it.
It follows that "C" can appear even though "C" does not appear in the training data.
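Here is a minimal sketch of that counterexample in Python (a toy per-position policy over the tokens "A", "B", "C"; the numbers and the crude update rule are illustrative, not any production RL algorithm). Pretraining on "AAA" alone gives "C" zero weight, but a small uniform exploration term over the full token support lets F("AAC") = 1 be discovered and reinforced:

    import random

    TOKENS = ["A", "B", "C"]

    def F(seq):                        # the reward model described above
        return 1.0 if seq == "AAC" else 0.0

    # "pretrained" per-position scores: only what the dataset ("AAA") contains
    scores = [{"A": 1.0, "B": 0.0, "C": 0.0} for _ in range(3)]

    def policy(pos, eps=0.1):
        # exploitation part plus a small uniform exploration term over the full token support
        total = sum(scores[pos].values())
        return {t: (1 - eps) * scores[pos][t] / total + eps / len(TOKENS) for t in TOKENS}

    def sample():
        seq = ""
        for pos in range(3):
            probs = policy(pos)
            seq += random.choices(TOKENS, weights=[probs[t] for t in TOKENS])[0]
        return seq

    random.seed(0)
    for _ in range(2000):              # crude reinforce-style loop
        seq = sample()
        if F(seq) > 0:
            for pos, tok in enumerate(seq):
                scores[pos][tok] += 1.0    # reinforce the rewarded tokens

    print(policy(2))   # "C" now carries most of the probability at the last position,
                       # even though "C" never appeared in the pretraining data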
I think it's not just token support, it's also having an understanding of certain concepts that allows you to arrive at new points like C, D, E, etc. But LLMs don't have an understanding of things; they are statistical models that predict what is statistically most likely to follow the input that you give them. But that will always be based on already existing data that is fed into the model. It can produce "new" stuff only by combining the "old" stuff in new ways, but it can't "think" of something entirely conceptually new, because it doesn't really "think".
> it can't "think" of something entirely conceptually new, because it doesn't really "think".
Hierarchical optimization (fast global + slow local) is a precise, implementable notion of "thinking." Whenever I've seen this pattern implemented, humans, without being told to do so by others in some forced way, seem to converge on the use of verb think to describe the operation. I think you need to blacklist the term think and avoid using it altogether if you want to think clearly about this subject, because you are allowing confusion in your use of language to come between you and understanding the mathematical objects that are under discussion.
> It can produce "new" stuff only by combining the "old" stuff in new ways,
False premise; previously debunked. Here is a refutation for you anyway, but made more extreme. Instead of modeling the language task using a pre-training predictive dataset objective, only train on a provided reward model. Such a setup never technically shows "old" stuff to the AI, because the AI is never shown stuff explicitly. It just always generates new things and then the reward model judges how well it did. The fact that it can do generation while knowing nothing shows that your claim that it can never generate something new -- by definition everything would be new at this point -- is clearly false. Notice that as it continually generates new things and the judgements occur, it will learn concepts.
> But LLMs don't have an understanding of things; they are statistical models that predict what is statistically most likely to follow the input that you give them.
Try out Jaynes's Probability Theory: The Logic of Science. Within it, the various underpinning assumptions that lead to probability theory are shown to be very reasonable, normal, and obviously good. Stuff like: represent plausibility with real numbers, keep rankings consistent and transitive, reduce to Boolean logic at certainty, and update so you never accept a Dutch-book sure loss -- which together force the ordinary sum and product rules of probability. Then notice that statistics is, in a certain sense, just what happens when you apply the rules of probability.
> also having an understanding of certain concepts that allows you to arrive at new points like C, D, E, etc. But LLMs don't have an understanding of things
This is also false. Look into the line of research that tends to go by the name of Circuits. It's been found that models have spaces within their weights that do correspond with concepts. Probably you don't understand what concepts are -- that abstractions and concepts are basically forms of compression that let you treat different things as the same thing -- so a different way to arrive at knowing that this would be true is to consider a model with fewer parameters than there are items in the dataset and notice that the model must successfully compress the dataset in order to complete its objective.
Yes ok, it can generate new stuff, but it's dependent on human curated reward models to score the output to make it usable. So it still depends on human thinking; its own "thinking" is not sufficient. And there won't be a point when human curated reward models are not needed anymore.
LLMs will make a lot of things easier for humans, because much of the thinking humans do has been automated into the LLM. But ultimately you run into a limit where the human has to take over.
> dependent on human curated reward models to score the output to make it usable.
This is a false premise, because there already exist systems, currently deployed, which are not dependent on human-curated reward models.
Refutations of your point include existing systems which generate a reward model based on some learned AI scoring function, allowing self-bootstrapping toward higher and higher levels.
A different refutation of your point is existing simulation contexts, for example in R1, in which code compilation is used as a reward signal; here the reward model comes from a simulator, not a human.
> So it still depends on human thinking
Since your premise was false your corollary does not follow from it.
> And there won't be a point when human curated reward models are not needed anymore.
This is just a repetition of your previously false statement, not a new one. You're probably becoming increasingly overconfident by restating falsehoods in different words, potentially giving the impression you've made a more substantive argument than you really have.
So to clarify, it could potentially come up with (something close to) C, but if you want it to get to D, E, F etc, it will become less and less accurate for each consecutive step, because it lacks the human curated reward models up to that point. Only if you create new reward models for C, the output for D will improve, and so on.
> Only if you create new reward models for C, the output for D will improve, and so on.
Again, tons of false claims. One is that 'you' have to create the reward model. Another that it has to be human-curated at all. Yet another is that you even need to do that at all: you can instead have the model build a bigger model of itself, train using its existing resources or more of them, then synthesize itself back down. Another way you can get around it is to augment the existing dataset in some way. No other changes except resource usage and yet the resulting model will be better, because more resources went into its construction.
Seriously notice: you keep making false claims again and again and again and again and again. You're not stating true things. You really need to reflect. If almost every sentence you speak on this topic is false, why is it that you think you should be able to persuade me to your views? Why should I believe your views, when you say so many things that are factually inaccurate, rather than my own views?
Ok, so you claim that LLMs can get smarter without human validation. So why do they hallucinate at all? And why are all reward models currently curated by humans? Or are you claiming they aren't?
I don't find it reasonable that you didn't understand my corrections, because current AIs already do. So I'm exiting the conversation.
https://chatgpt.com/share/683a3c88-62a8-8008-92ef-df16ce2e8a...
Ok, this is interesting indeed and I'll investigate more into it. But I think my points still stand. Let me elaborate.
An LLM only learns through input text. It doesn't have a first-person 3D experience of the world. So it can't execute physical experiments, or even understand them. It can understand the texts about it, but it can't visualize it, because it doesn't have a visual experience.
And ultimately our physical world is governed by physical processes. So at the fundamentals of physical reality, the LLMs lack understanding. And therefore will stay dependent on humans educating and correcting it.
You might get pretty impressively far with all kinds of techniques, but you can't cross this barrier with just LLMs. If you want to, you have to give it senses like humans to give it an experience of the world, and make it understand these experiences. And sure they're already working on that, but that is a lot harder to create than a comprehensive machine learning algorithm.
You're doing this thing again where you say tons of things that aren't true.
> An LLM only learns through input text.
This is false. There already exist LLMs which understand more than just text. Relevant search term: multi-modality.
> It doesn't have a first-person 3D experience of the world.
Again false. It is trivial to create such an experience with multi-modality. Just set up an input device which streams that.
> So it can't execute physical experiments, or even understand them.
Here you get confused again. It doesn't follow, based on perceptual modality, that someone can't do or understand experiments. Helen Keller was blind, but she could still do an experiment.
Beyond just being confused, you also make another false claim. Current LLMs already have the capacity to run experiments and do so. Search terms: tool usage, ReAct loop, AI agents.
> It can understand the texts about it, but it can't visualize it, because it doesn't have a visual experience.
Again, false!
Multi-modal LLMs currently possess the ability to generate images.
> And ultimately our physical world is governed by physical processes. So at the fundamentals of physical reality, the LLMs lack understanding. And therefore will stay dependent on humans educating and correcting it.
Again false. The same sort of reasoning would claim that Helen Keller couldn't read a book, but braille exists. The ability to acquire information outside an umwelt is a capability that intelligence enables.
You come up with very interesting points, and I'm thankful for that. But I also think you're missing the crux of my message. LLMs don't experience the world the same way humans do. And they also don't think in the same way. So you can train them very far with enough input data, but there will always be a limit to what they can understand compared to a human. If you want them to think and experience the world in the same way, you basically have to create a complete human.
My example about the visualization was just an example to prove a point. What I ultimately mean is the whole complete human experience. And besides, if you give it eyes, what data are you gonna train it on? Most videos on the internet are filmed with one lens, which doesn't give you a 3D visual. So you would have to train it like a baby growing up, by trial and error. And then again we're talking only about the visual.
Helen Keller wasn't born blind, so she did have a chance to develop her visual brain functions. Most people can visualize things with their eyes closed.
Chess engines cannot see like a human can. When they think they don't necessarily think using the exact same method that a human uses. Yet train a chess engine for a very long time and it can actually end up understanding chess better than a human can.
I do understand the points you are attempting to make. The reason you're failing to prove your point is not because I am failing to understand the thrust of what you were trying to argue.
Imagine you were talking to a rocket scientist about engines, and your understanding of engines was predicated on your experience with cars. You start making claims about the nature of engines, and they disagree with you, argue with you, and point out all these ways that you're wrong. Is this person doing that because they're not able to understand your points? Or is it more likely that their experience with engines different from the ones you're used to gives them a different perspective, one that forces them to think about the world in a different way than you do?
Well chess has a very limited set of rules and playing field. And the way to win in chess is to be able to think forward, how all the moves could play out, and pick the best one. This is relatively easy to create an algorithm for that surpasses humans. That is what computers are good at: executing specific algorithms very fast. A computer will always beat a human to that.
So such algorithms can replace certain functions of humans, but they can't replace the human as a whole. And that is the same with LLMs. They save us time for repetitive tasks, but they can't replace all of our functions. In the end an LLM is a comprehensive algorithm constantly updated with machine learning. It's very helpful, but it has its limits. The limit is constantly surpassed, but it will never replace a full human. To do that you need to do a whole lot more than a comprehensive machine learning algorithm. They can get very close to something that looks like a human, but there will always be something lacking. Which then again can be improved upon, but you never reach the same level.
That is why I don't worry about AI taking our jobs. They replace certain functions, which will make our job easier. I don't see myself as a coder, I see myself as a system designer. I don't mind if AIs take over (certain parts of) the coding process (once they're good enough). It will just make software development easier and faster. I don't think there will be less demand for software developers.
It will change our jobs, and we'll have to adapt to that. But that is always what happens with new technology. You have to grow along with the changes and not expect that you can keep doing the same thing for the same value. But I think that for most software developers that isn't news. In the old days people were programming in assembly, then compiled languages came and then higher level languages. Now we have LLMs, which (when they become good enough) will just be another layer of abstraction.
> And there won't be a point when human curated reward models are not needed anymore.
This doesn't follow at all. There's no reason why a model can not be made to produce reward models.
But reward models are always curated by humans. If you generate a reward model with an LLM, it will contain hallucinations that need to be corrected by humans. But that is what a reward model is for. To correct the hallucinations of LLMs.
So yeah theoretically you could generate reward models with LLMs, but they won't be any good, unless they are curated by other reward models that are ultimately curated by humans.
> But reward models are always curated by humans.
There is no inherent reason why they need to be.
> So yeah theoretically you could generate reward models with LLMs, but they won't be any good, unless they are curated by other reward models that are ultimately curated by humans.
This reasoning is begging the question: the reasoning holds only if the conclusion is already assumed to be true. It's therefore a logically invalid argument.
There is no inherent reason why this needs to be the case.
Sorry but I don't follow your logic. Are you claiming that reward models that aren't curated by humans perform as well as ones that are?
Then what is a reward model's function according to you?
I'm claiming exactly what I wrote: That there is no inherent reason why a human curated one needs to be better.
In reinforcement learning and related fields, a _reward model_ is a function that assigns a scalar value (a reward) to a given state, representing how desirable it is. You're at liberty to have compound states: for example, a trajectory (often called tau) or a state-action pair (typically represented by s and a).
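To make that concrete, here is a minimal sketch in Python (the toy rewards are made up purely for illustration): a reward model is nothing more than a function from a state, a state-action pair, or a whole trajectory tau to a scalar.

    from typing import Sequence, Tuple

    State = str
    Action = str
    Trajectory = Sequence[Tuple[State, Action]]

    def reward_state(s: State) -> float:              # r(s): scalar for a single state
        return 1.0 if s == "AAC" else 0.0

    def reward_pair(s: State, a: Action) -> float:    # r(s, a): scalar for a state-action pair
        return 1.0 if (s, a) == ("AA", "C") else 0.0

    def reward_trajectory(tau: Trajectory) -> float:  # r(tau): scalar for a whole trajectory
        return sum(reward_pair(s, a) for s, a in tau)

    print(reward_state("AAC"),
          reward_pair("AA", "C"),
          reward_trajectory([("A", "A"), ("AA", "C")]))   # 1.0 1.0 1.0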
But doesn't reward for "**C" mean that "C" is in the training data?
I am not sure if that is an accurate model, but if you think of it as a vector space, sure you can generate a lot of vectors from some set of basis vectors, but you can never generate a new basis vector from the others, since they are linearly independent, so there are a bunch of new vectors you can never generate.
For an example of a reward model that doesn't include "C" explicitly, consider one defined as the count of the one bits in the letters of the input. It would define a reward for "C", yet "C" never shows up explicitly, because the reward has universal reach and "C" falls within it as a result.
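Concretely, a minimal sketch of that reward model in Python: nothing in its definition mentions "C", yet it assigns "AAC" a perfectly well-defined reward.

    def reward(s: str) -> int:
        # count of the one bits across the letters of the input
        return sum(bin(ord(ch)).count("1") for ch in s)

    print(reward("AAA"), reward("AAC"))   # 6 and 7: "C" earns reward without appearing in any rule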
> But doesn't reward for "**C" mean that "C" is in the training data?
You're running into an issue here due to overloading terms. Training data has three different meanings in this conversation depending on which context you are in.
1. The first is the pre-training context in which we're provided a dataset. My words were appropriate in that context.
2. The second is the reinforcement learning setup context in which we don't provide any dataset, but instead provide a reward model. My words were appropriate in that context.
3. The final context is that during the reinforcement learning algorithm's operation, one of the things it does is generate datasets and then learn from them. Here, it's true that there exists a dataset in which "C" is defined.
Recall that the important aspect of this discussion has to do with data provenance. We led off with someone claiming that an analog of "C" wasn't provided in the training data by a human explicitly. That means that I only need to establish that "C" doesn't show up in either of the inputs to a learning algorithm. That is case one and that is case two. It is not case three, because upon entering case three the provenance is no longer from humans.
Therefore, the question "doesn't the reward model for C mean that C is in the training data?" has the answer: no, it doesn't, because although "C" appears in case three, it doesn't appear in case one or case two, and those were the two cases relevant to the question. That it appears in case three is just the mechanism by which the refutation of the claim that it could not appear occurs.
> I am not sure if that is an accurate model, but if you think of it as a vector space, sure you can generate a lot of vectors from some set of basis vectors, but you can never generate a new basis vector from the others, since they are linearly independent, so there are a bunch of new vectors you can never generate.
Your model of vectors sounds right to me, but your intuitions about it are a little bit off in places.
In machine learning, we introduce non-linearities into the model (for example, through activation functions like ReLU or sigmoid). This breaks the strict linear structure of the model, enabling it to approximate a much wider range of functions. There's a mathematical proof (known as the Universal Approximation Theorem) that shows how this non-linearity allows neural networks to represent virtually any continuous function, regardless of its complexity.
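To make that concrete, here is a minimal sketch in Python/numpy (a toy one-hidden-layer network with random ReLU features, purely illustrative, not how production models are trained): a strictly linear model cannot fit y = sin(x), while the same inputs passed through ReLU non-linearities can.

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
    y = np.sin(x).ravel()

    # purely linear model: y ~ w*x + b
    X_lin = np.hstack([x, np.ones_like(x)])
    w_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
    err_lin = np.abs(X_lin @ w_lin - y).max()

    # one hidden layer of 50 random ReLU units, linear read-out on top
    W, b = rng.normal(size=(1, 50)), rng.normal(size=50)
    H = np.hstack([np.maximum(0, x @ W + b), np.ones_like(x)])   # ReLU(xW + b), plus a bias column
    w_relu, *_ = np.linalg.lstsq(H, y, rcond=None)
    err_relu = np.abs(H @ w_relu - y).max()

    print(err_lin, err_relu)   # the ReLU features fit sin(x) far more closely than the linear model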
We're not really talking about datasets when we move into a discussion about this. It's closer to a discussion of inductive biases. Inductive bias refers to the assumptions a model makes about the underlying structure, which guide it toward certain types of solutions. If something doesn't map to the structure the inductive bias assumes, it can be possible for the model to be incapable of learning that function successfully.
The last generation of popular architectures used convolutional networks quite often. These baked in an inductive bias about where data that was related to other data was and so made learning some functions difficult or impossible when those assumptions were violated. The current generation of models tends to be built on transformers. Transformers use an attention mechanism that can determine what data to focus on and as a result they are more capable of avoiding the problems that bad inductive bias can create since they can end up figuring out what they are supposed to be paying attention to.
First, how much of coding is really never done before?
And secondly, what you say is false (at least if taken literally). I can create a new programming language, give the definition of it in the prompt, ask it to code something in my language, and expect something out. It might even work.
> I can create a new programming language, give the definition of it in the prompt, ask it to code something in my language, and expect something out. It might even work.
I literally just pointed out the same thing without having seen your comment.
Second this. I've done this several times, and it can handle it well. Already GPT3.5 could easily reason about hypothetical languages given a grammar or a loose description.
I find it absolutely bizarre that people still hold on to this notion that these models can't do anything new, because it seems implausible that they have actually tried it, given how well it works.
If you give it the rules to generate something, why can't it generate it? That's what something like Mockaroo[0] does. It's just more formal. That's pretty much what LLM training does, extracting patterns from a huge corpus of text. Then it goes on to generate according to the patterns. It cannot generate a new pattern that is not a combination of the previous ones.
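To illustrate what I mean by rule-driven generation, here is a minimal sketch in Python with a made-up toy grammar: every output conforms to the rules, and every output is just a combination of what the rules already encode.

    import random

    GRAMMAR = {                          # a made-up toy grammar
        "expr": [["num"], ["expr", "op", "expr"]],
        "op":   [[" + "], [" * "]],
        "num":  [["1"], ["2"], ["3"]],
    }

    def expand(symbol, depth=0):
        if symbol not in GRAMMAR:
            return symbol                # terminal: emit as-is
        options = GRAMMAR[symbol]
        rule = options[0] if depth > 3 else random.choice(options)   # cap recursion depth
        return "".join(expand(s, depth + 1) for s in rule)

    random.seed(1)
    print([expand("expr") for _ in range(5)])   # rule-conforming strings, generated without being written anywhere first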
> If you give it the rules to generate something, why can't it generate it?
It can, but that does not mean that what is generated is not new, unless the rules in question constrain the set to the point where only one outcome is possible.
If I tell you that a novel has a minimum of 40,000 words, it does not mean that no novel is, well, novel (not sorry), just because I've given you rules to stay within. Any novel will in some sense be "derived from" an adherence to those rules, and yet plenty of those novels are still new.
The point was that by describing a new language in a zero-shot manner, you ensure that no program in that language exists either in the training data or in the prompt, so what it generates must at a minimum be new in the sense that it is in a language that has not previously existed.
If you then further give instructions for a program that incorporates constraints that are unlikely to have been used before (though this is harder), you can further ensure the novelty of the output along other axes.
You can keep adding arbitrary conditions like this, and LLMs will continue to produce output. Human creative endeavour is often similarly constrained to rules: Rules for formats, rules for competitions, rules for publications, and yet nobody would suggest this means that the output isn't new or creative, or suggest that the work is somehow derivative of the rules.
This notion is setting a bar for LLMs we don't set for humans.
> That's pretty much what LLM training does, extracting patterns from a huge corpus of text. Then it goes on to generate according to the patterns.
But when you describe a new pattern as part of the prompt, the LLM is not being trained on that pattern. It's generating on the basis of interpreting what it is told in terms of the concepts it has learned, and developing something new from it, just as a human working within a set of rules is not creating merely derivative works just because they have past knowledge and have been given a set of rules to work to.
> It cannot generate a new pattern that is not a combination of the previous ones.
The entire point of my comment was that this is demonstrably false unless you are talking strictly in the sense of a deterministic view of the universe where everything including everything humans do is a combination of what came before. In which case the discussion is meaningless.
Specific models can be better or worse at it, but unless you can show that humans somehow exceed the Turing computable there isn't even a plausible mechanism for how humans could even theoretically be able to produce anything so much more novel that it'd be impossible for LLMs to produce something equally novel.
I was referring to "new" as some orthogonal dimension in the same space. By your definition, any slight change in the parameters results in something new. I was arguing more that if the model knows about axes x and y, then its output is constrained to a plane unless you add z. But more often than not its output will be a cylinder (extruded from a circle in the x,y plane) instead of a sphere.
The same thing goes for image generation. Every picture is new, but it's a combination of the pictures it found. It does not learn about things like perspective, values, forms, anatomy, ... the way an artist does, which are the proper dimensions of drawing.
> that humans somehow exceed the Turing computable
Already done by Gödel's incompleteness theorems[0] and the halting problem[1]. Meaning that we can do some stuff that no algorithm can do.
[0]: https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...
You completely fail to understand Gödel's incompleteness theorems and the halting problem if you think they are evidence of something humans can do that machines can not. It makes the discussion rather pointless if you lack that fundamental understanding of the subject.
Second, how much of commenting is really never done before?
Good question. Why isn't the GP using an LLM to generate comments, then?
For some types of comment, it really would be tempting to automate the answers, because especially the "stochastic parrot" type comments are getting really tedious and inane, and ironically come across as people parroting the same thing over and over instead of thinking.
But the other answer is that often the value in responding is to sharpen the mind and be forced to think through and formulate a response even if you've responded to some variation of the comment you reply to many times over.
A lot of comments that don't give me any value to read are comments I still get value out of through the process of replying to for that reason.
> how much of coding is really never done before?
A lot, because we use libraries for the 'done frequently before' code. I don't generate a database driver for my webapp with an LLM.
We use libraries for SOME of the 'done frequently' code.
But how much of enterprise programming is 'get some data from a database, show it on a web page (or GUI), store some data in the database', with variants?
It makes sense that we have libraries for abstracting away some common things. But it also makes sense that we can't abstract away everything we do multiple times, because at some point it just becomes so abstract that it's easier to write it yourself than to try to configure some library. That does not mean it's not a variant of something done before.
> we can't abstract away everything we do multiple times
I think there's a fundamental truth about any code that's written which is that it exists on some level of specificity, or to put it in other words, a set of decisions have been made about _how_ something should work (in the space of what _could_ work) while some decisions have been left open to the user.
Every library that is used is essentially this. Database driver? Underlying I/O decisions are probably abstracted away already (think Netty vs Mina), and decisions on how to manage connections, protocol handling, bind variables, etc. are made by the library, while questions remain for things like which specific tables and columns should be referenced. This makes the library reusable for this task as long as you're fine with the underlying decisions.
Once you get to the question of _which specific data is shown on a page_ the decisions are closer to the human side of how we've arbitrarily chosen to organise things in this specific thousandth-iteration of an e-commerce application.
The devil is in the details (even if you know the insides of the devil aren't really any different).
> Once you get to the question of _which specific data is shown on a page_ the decisions are closer to the human side of how we've arbitrarily chosen to organise things in this specific thousandth-iteration of an e-commerce application.
That's why communication is so important, because the requirements are the primary decision factors. A secondary factor is prior technical decisions.
> it's easier to write it yourself than to try to configure some library
Yeah, unfortunately LLMs will make this worse. Why abstract when you can generate?
I am already seeing this a lot at work :(
Cue the Haskell gang: "Design patterns are workarounds for weaknesses in your language".
> First, how much of coding is really never done before?
Lots of programming doesn't have one specific right answer, but a bunch of possible right answers with different trade-offs. The programmer's job isn't necessarily just to get working code. I don't think we are at the point where LLMs can see the forest for the trees, so to speak.
That’s not true. LLMs are great translators; they can translate ideas to code. And that doesn’t mean they have to be recalling previously seen text.
Generating unseen code is not hard.
Set rules on what’s valid, which most languages already do; omit generation of known code; generate everything else
The computer does the work, programmers don’t have to think it up.
A typed-language example to explain: generate valid func sigs.
func f(int1, int2) return int{}
If that’s our only func sig in our starting set, then it’s obvious: relative to our tiny starter set, func f(int1, int2, int3) return int{} is novel.
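A minimal sketch of that in Python (the toy signature rules are made up): enumerate what the rules allow, subtract the starter set, and whatever remains is novel relative to it.

    def sig(n_ints: int) -> str:
        params = ", ".join(f"int{i + 1}" for i in range(n_ints))
        return f"func f({params}) return int{{}}"

    known = {sig(2)}                            # the tiny starter set: func f(int1, int2) return int{}
    valid = {sig(n) for n in range(1, 5)}       # everything the (toy) rules allow, up to arity 4
    print(sorted(valid - known))                # includes "func f(int1, int2, int3) return int{}" -- novel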
This Redis post is about fixing a prior decision of a random programmer. A linguistics decision.
That’s why LLMs seem worse than programmers: we make linguistic decisions that fit social idioms.
If we just want to generate all the code never before seen by this model, we don’t need a programmer. If we need to abide by laws of a flexible, language-like nature, that’s what a programmer is for: composing not just code but compliance with ground truth.
That antirez is good at Redis is a bias since he has context unseen by the LLM. Curious how well antirez would do with an entirely machine-generated Redis clone that was merely guided by experts. Would his intuition for Redis’ implementation be useful to a completely unknown implementation?
He’d make a lot of newb errors and need mentorship, I’m guessing.
I think we're hoping for more than the 'infinite monkeys bashing out semantically correct code' approach.
Ok, define what that means and make it. Then, as soon as you do, realize you run into Gödel: your machine doesn’t solve problems related to its own existence and needs outside help. So you need to generate that yet-unseen solution, which lacks the context to understand itself… repeat, and see it’s exactly generating one yet-unseen layer of logic after another.
Read the article; his younger self failed to see the logic needed now. Add that onion peel. There’s no such thing as perfect clairvoyance.
Even Yann LeCun’s energy based models driving robots have the same experience problem.
Make a computer that can observe all of the past and future.
Without perfect knowledge our robots will fail to predict some composition of space time before they can adapt.
So there’s no probe we can launch that’s forever and generally able to survive with our best guess when launched.
More people need to study physical experiments and physics and not the semantic rigor of academia. No matter how many ideas we imagine there is no violating physics.
Pop culture seems to have people feeling starship Enterprise is just about to launch from dry dock.
Progress, sure, but the rate they’ve improved hasn’t been particularly fast recently.
Programming has become vastly more efficient in terms of programmer effort over decades, but making some aspects of the job more efficient just means all your effort is spent on what didn’t improve.
People seem to have forgotten how good the 2023 GPT-4 really was at coding tasks.
The latest batch of LLMs has been getting worse in my opinion. Claude in particular seems to be going backwards with every release. The verbosity of the answers is infuriating. You ask it a simple question and it starts by inventing the universe, poorly
> Perhaps you remember that language models were completely useless at coding some years ago
No, I don't remember that. They are doing similar things now to what they did 3 years ago. They were still a decent rubber duck 3 years ago.
And 6 years ago GPT2 had just been released. You're being obtuse by interpreting "some years" as specifically 3.