Notable:
> o3 finds the kerberos authentication vulnerability in 8 of the 100 runs
And I'd guess this only became a blog post because the author already knew about the vuln and was just curious to see if the intern could spot it too, given a curated subset of the codebase.
He did do exactly what you say – except right after that, while reviewing the outputs, he found that it had also discovered a different 0day.
Now the question is whether spending the same time analyzing that bit of code, instead of throwing an automated intern at it, would have been time better spent.
The time they didn't spend reading the 13k LOC themselves would've been time better spent.
What?