First, how much of coding is really never done before?
And secondly, what you say is false (at least if taken literally). I can create a new programming language, give the definition of it in the prompt, ask it to code something in my language, and expect something out. It might even work.
> I can create a new programming language, give the definition of it in the prompt, ask it to code something in my language, and expect something out. It might even work.
I literally just pointed out the same thing without having seen your comment.
Second this. I've done this several times, and it can handle it well. Already GPT3.5 could easily reason about hypothetical languages given a grammar or a loose description.
I find it absolutely bizarre that people still hold on to this notion that these models can't do anything new, because it seems implausible that they have actually tried it, given how well it works.
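To make that concrete, here's roughly what such an experiment looks like when driven from code. This is only a sketch: the toy "Pipescript" language, its syntax, and the model name are all invented for illustration, and any chat-capable LLM client would do in place of the OpenAI SDK.

```python
# Sketch of a "define a new language in the prompt" experiment.
# The toy language ("Pipescript"), its syntax, and the chosen model are
# made up for illustration; any chat-capable LLM client would work.
from openai import OpenAI

TOY_LANGUAGE_SPEC = """\
Pipescript is a made-up language with these rules:
  - A program is a series of stages separated by '|>'.
  - Each stage is 'name(arg1, arg2, ...)'.
  - 'filter(pred)' keeps matching items, 'map(f)' transforms them,
    'take(n)' keeps the first n items, 'print' writes the result.
  - Example: range(1, 10) |> filter(odd) |> take(3) |> print
"""

prompt = (
    TOY_LANGUAGE_SPEC
    + "\nWrite a Pipescript program that prints the squares of the even"
      " numbers between 1 and 20, and briefly explain each stage."
)

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whatever model you have access to
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```

Since no program in the invented language can exist in the training data, whatever comes back is, at minimum, new in that narrow sense.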
If you give it the rules to generate something, why can't it generate it? That's what something like Mockaroo[0] does. It's just more formal. That's pretty much what LLM training does, extracting patterns from a huge corpus of text. Then it goes on to generate according to those patterns. It cannot generate a new pattern that is not a combination of the previous ones.
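For what it's worth, "rules in, records out" generation of the Mockaroo kind looks something like the sketch below. The schema and field generators are made up purely to show the shape of it; every record is different, but only within the space the rules define.

```python
# Minimal sketch of Mockaroo-style rule-driven generation: fixed rules in,
# records that follow the rules out. The schema below is made up.
import random
import string

def fake_email() -> str:
    user = "".join(random.choices(string.ascii_lowercase, k=8))
    return f"{user}@example.com"

SCHEMA = {
    "id": lambda: random.randint(1, 10_000),
    "name": lambda: random.choice(["Ada", "Grace", "Linus", "Barbara"]),
    "email": fake_email,
    "age": lambda: random.randint(18, 90),
}

def generate(n: int) -> list[dict]:
    # Each record is "new", but never outside what the rules allow.
    return [{field: rule() for field, rule in SCHEMA.items()} for _ in range(n)]

for row in generate(3):
    print(row)
```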
> If you give it the rules to generate something, why can't it generate it?
It can, but that does not mean that what is generated is not new, unless the rules in question constrain the set to the point where only one outcome is possible.
If I tell you that a novel has a minimum of 40,000 words, it does not mean that no novel is, well, novel (not sorry), just because I've given you rules to stay within. Any novel will in some sense be "derived from" an adherence to those rules, and yet plenty of those novels are still new.
The point was that by describing a new language in a zero-shot manner, you ensure that no program in that language exists either in the training data or in the prompt, so what it generates must at a minimum be new in the sense that it is in a language that has not previously existed.
If you then further give instructions for a program that incorporates constraints that are unlikely to have been used before (though this is harder), you can ensure the novelty of the output along other axes as well.
You can keep adding arbitrary conditions like this, and LLMs will continue to produce output. Human creative endeavour is often similarly constrained by rules: rules for formats, rules for competitions, rules for publications; and yet nobody would suggest this means that the output isn't new or creative, or that the work is somehow derivative of the rules.
This notion is setting a bar for LLMs we don't set for humans.
> That's pretty much what LLM training does, extracting patterns from a huge corpus of text. Then it goes on to generate according to those patterns.
But when you describe a new pattern as part of the prompt, the LLM is not being trained on that pattern. It's generating on the basis of interpreting what it is told in terms of the concepts it has learned, and developing something new from it, just as a human working within a set of rules is not creating merely derivative work just because we have past knowledge and have been given a set of rules to work to.
> It cannot generate a new pattern that is not a combination of the previous ones.
The entire point of my comment was that this is demonstrably false unless you are talking strictly in the sense of a deterministic view of the universe where everything including everything humans do is a combination of what came before. In which case the discussion is meaningless.
Specific models can be better or worse at it, but unless you can show that humans somehow exceed the Turing computable there isn't even a plausible mechanism for how humans could even theoretically be able to produce anything so much more novel that it'd be impossible for LLMs to produce something equally novel.
I was referring to "new" as some orthogonal dimension in the same space. By your definition, any slight change in the parameters results in something new. I was arguing more that if the model knows about axes x and y, then its output is constrained to that plane unless you add z. And more often than not its output will be a cylinder (extruded from a circle in the x,y plane) instead of a sphere.
The same thing goes for image generation. Every picture is new, but it's a combination of the pictures it has seen. It does not learn about things like perspective, values, forms, anatomy, etc. the way an artist does, which are the proper dimensions of drawing.
> that humans somehow exceed the Turing computable
Already done by Gödel's incompleteness theorems[0] and the halting problem[1]. Meaning that we can do some stuff that no algorithm can do.
[0]: https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...
[1]: https://en.wikipedia.org/wiki/Halting_problem
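For reference, the halting-problem argument itself can be sketched as code. `halts` here is the hypothetical decider that the diagonal argument rules out, not anything you can actually implement; the sketch only illustrates what the theorem states.

```python
# Sketch of the classic halting-problem diagonal argument. `halts` is a
# hypothetical total decider; the argument shows no such function can exist,
# so this illustrates the theorem rather than implementing anything.
def halts(program, arg) -> bool:
    """Supposed oracle: True iff program(arg) eventually terminates."""
    raise NotImplementedError  # no implementation can be correct for all inputs

def diagonal(program):
    # Do the opposite of whatever `halts` predicts about running
    # `program` on its own source.
    if halts(program, program):
        while True:   # halts() said it terminates, so loop forever
            pass
    return            # halts() said it loops, so halt immediately

# Feeding `diagonal` to itself is the contradiction: if halts(diagonal, diagonal)
# returned True, diagonal(diagonal) would loop forever; if False, it would halt.
# Either way the oracle is wrong, so no general `halts` can exist.
```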
You completely fail to understand Gödel's incompleteness theorems and the halting problem if you think they are evidence of something humans can do that machines can not. It makes the discussion rather pointless if you lack that fundamental understanding of the subject.
Second, how much of commenting is really never done before?
Good question. Why isn't the GP using an LLM to generate comments, then?
For some types of comment, it really would be tempting to automate the answers, because especially the "stochastic parrot" type comments are getting really tedious and inane, and ironically come across as people parroting the same thing over and over instead of thinking.
But the other answer is that often the value in responding is to sharpen the mind and be forced to think through and formulate a response even if you've responded to some variation of the comment you reply to many times over.
A lot of comments that don't give me any value to read are comments I still get value out of through the process of replying to for that reason.
> how much of coding is really never done before?
A lot, because we use libraries for the 'done frequently before' code. I don't generate a database driver for my webapp with an LLM.
We use libraries for SOME of the 'done frequently' code.
But how much of enterprise programming is 'get some data from a database, show it on a Web page (or gui), store some data in the database', with variants?
It makes sense that we have libraries for abstracting away some common things. But it also makes sense that we can't abstract away everything we do multiple times, because at some point it just becomes so abstract that it's easier to write it yourself than to try to configure some library. That does not mean it's not a variant of something done before.
> we can't abstract away everything we do multiple times
I think there's a fundamental truth about any code that's written which is that it exists on some level of specificity, or to put it in other words, a set of decisions have been made about _how_ something should work (in the space of what _could_ work) while some decisions have been left open to the user.
Every library that is used is essentially this. Database driver? Underlying I/O decisions are probably abstracted away already (think Netty vs Mina), and decisions on how to manage connections, protocol handling, bind variables, etc. are made by the library, while questions remain for things like which specific tables and columns should be referenced. This makes the library reusable for this task as long as you're fine with the underlying decisions.
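As a toy illustration of that split, here's a sketch in the spirit of a data-access helper: the connection and execution decisions are baked in, while the table and column choices stay with the caller. The sqlite3 usage is real standard library, but the function names and the example schema are made up.

```python
# Sketch of "a library is a bundle of decisions": connection handling and
# statement execution are decided here; which tables and columns to query
# is left open to the caller. Names and schema are illustrative only.
import sqlite3
from contextlib import contextmanager

@contextmanager
def connection(path: str = "app.db"):
    # Decision made by the "library": one short-lived connection per call,
    # rows exposed as dict-like sqlite3.Row objects.
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    try:
        yield conn
    finally:
        conn.close()

def fetch(table: str, columns: list[str], limit: int = 50) -> list[dict]:
    # Decisions left open: which table, which columns, how many rows.
    cols = ", ".join(columns)           # assumes trusted identifiers
    sql = f"SELECT {cols} FROM {table} LIMIT ?"
    with connection() as conn:
        return [dict(row) for row in conn.execute(sql, (limit,))]

# The app-specific part is then just the arbitrary, requirement-driven choices:
# rows = fetch("orders", ["id", "customer_id", "total"], limit=20)
```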
Once you get to the question of _which specific data is shown on a page_ the decisions are closer to the human side of how we've arbitrarily chosen to organise things in this specific thousandth-iteration of an e-commerce application.
The devil is in the details (even if you know the insides of the devil aren't really any different).
> Once you get to the question of _which specific data is shown on a page_ the decisions are closer to the human side of how we've arbitrarily chosen to organise things in this specific thousandth-iteration of an e-commerce application.
That's why communication is so important, because the requirements are the primary decision factors. A secondary factor is prior technical decisions.
> it's easier to write it yourself than to try to configure some library
Yeah, unfortunately LLMs will make this worse. Why abstract when you can generate?
I am already seeing this a lot at work :(
Cue Haskell gang "Design patterns are workarounds for weaknesses in your language".
> First, how much of coding is really never done before?
Lots of programming doesn't have one specific right answer, but a bunch of possible right answers with different trade-offs. The programmer's job isn't necessarily just to get working code. I don't think we are at the point where LLMs can see the forest for the trees, so to speak.