This matches my experience, but I can't explain it. Do you know what's going on?
My understanding is that it comes down to context size. Companies like Cursor try to minimize the amount of context sent to the models to keep their own costs down. Claude Code seems to send a lot more context with every request, and that appears to make the difference.
Just guessing, but the new Opus was probably RL-tuned to work better with Claude Code's tool calls.