Anyone else feel like this is a best-case application for LLMs?
You could in theory automate the entire process and treat the LLM as a very advanced fuzzer. Run it against your target in one or more VMs. If the VM crashes or otherwise exhibits anomalous behavior, you've found something. (Most exploits like this will crash the machine initially, before you refine them.)
On one hand: great application for LLMs.
On the other hand: that conversely implies that demonstrating it doesn't mean that much.
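To make the "LLM as a very advanced fuzzer" loop concrete, here's a rough sketch of the harness side, under heavy assumptions. `propose_inputs` is a hypothetical placeholder for whatever would ask the model for candidate inputs, a subprocess stands in for the VM, and `TARGET_SRC` is a toy target that fails on one magic string. None of this is what any particular team actually runs; it only shows the "run it, watch for crashes or hangs, keep what misbehaves" shape.

```python
import subprocess
import sys
from typing import Iterable, List, Tuple

# Toy stand-in for the real target (assumption): exits abnormally
# whenever the input contains the string "BOOM".
TARGET_SRC = (
    "import sys\n"
    "data = sys.stdin.read()\n"
    "sys.exit(70 if 'BOOM' in data else 0)\n"
)


def propose_inputs() -> List[str]:
    """Hypothetical placeholder for the LLM: returns candidate inputs.

    A real setup would query a model here and feed earlier results back in.
    """
    return ["hello", "A" * 64, "BOOM", ""]


def run_candidate(cmd: List[str], payload: str, timeout: float = 5.0) -> str:
    """Run one candidate against the target and classify the outcome."""
    try:
        proc = subprocess.run(
            cmd,
            input=payload.encode(),
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "hang"  # target did not finish in time
    # Any non-zero exit status is treated as anomalous in this sketch.
    return "ok" if proc.returncode == 0 else "crash"


def fuzz(candidates: Iterable[str], cmd: List[str]) -> List[Tuple[str, str]]:
    """Return the (payload, verdict) pairs that did not end in 'ok'."""
    findings = []
    for payload in candidates:
        verdict = run_candidate(cmd, payload)
        if verdict != "ok":
            findings.append((payload, verdict))
    return findings


if __name__ == "__main__":
    target_cmd = [sys.executable, "-c", TARGET_SRC]
    for payload, verdict in fuzz(propose_inputs(), target_cmd):
        print(verdict, repr(payload))
```

In a real setup the subprocess would be a VM you can snapshot and reset, and "anomalous" would cover a lot more than the exit code, but the feedback loop has the same structure.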
I mean, yes, they're doing it, but my question was really whether people share my belief that it's a particularly well-fitting application ;)
(Also yeah feels like the "FIRST!!1!eleven" thing metastasized from comment sections into C-level executives…)
That seems to be really preoccupied with who was first, without looking at the magnitude of the results, which is far from "meh."
I think it was more of a PoC. I would be more impressed if it were deployed in production. They say so themselves: "we want to reiterate that these are highly experimental results". If the dividends are massive, would they not deploy it in production and tell the world about it?