We're seeing diminishing returns in benchmark space, but that is partly an artefact of how the benchmarks are constructed, not a faithful reflection of how things are actually progressing.
Well, yes, but there is no better way to measure progress without resorting to pure hearsay. How would you accurately assess something so inherently vague?
Change the benchmark space we care about: focus only on ARC-AGI-2, for example, and suddenly the gains are no longer diminishing but accelerating.