I got 700+ tokens/sec on o3 after the announcement; I suspect it's a quantized version.
https://x.com/hyperknot/status/1932476190608036243
Or maybe they just brought much faster, much cheaper hardware online.
Or they are using a fast add-on decoder (e.g. speculative decoding).
Do you also have numbers on intelligence before and after?
Is that input or output tokens/s?
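For what it's worth, the two rates can differ a lot, so it helps to compute them separately. A rough sketch of how one might do that, assuming you have the time to first token and the total request time from your own client-side timing. The function name and the example numbers are made up for illustration; they are not measurements from o3.

```python
def throughput(input_tokens, output_tokens, ttft_s, total_s):
    """Return (input_tps, output_tps) for one streamed request.

    ttft_s  : seconds until the first output token arrived (prompt processing)
    total_s : seconds until the last output token arrived
    """
    if ttft_s <= 0 or total_s <= ttft_s:
        raise ValueError("need 0 < ttft_s < total_s")
    # Input rate: prompt tokens processed before the first token appeared.
    input_tps = input_tokens / ttft_s
    # Output rate: generated tokens over the generation window only.
    output_tps = output_tokens / (total_s - ttft_s)
    return input_tps, output_tps


# Hypothetical request: 2000 prompt tokens, 1400 generated tokens,
# first token after 0.5 s, finished after 2.5 s.
input_tps, output_tps = throughput(2000, 1400, ttft_s=0.5, total_s=2.5)
print(f"input: {input_tps:.0f} tok/s, output: {output_tps:.0f} tok/s")
```

Dividing output tokens by the total time instead would understate the output rate, since it folds prompt processing and network latency into the same number.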