Whisper large-v3 from OpenAI, but we host it ourselves on Modal.com. It's easy, fast, has no rate limits, and is cheap as well.
If you want to run it locally, I'd still go with Whisper, and I'd look at something like whisper.cpp https://github.com/ggml-org/whisper.cpp. It runs quite well.
I second whisper.cpp. I've had a good experience running it locally. I wrote a Python wrapper for invoking its whisper-cli: https://github.com/pramodbiligiri/annotate-subs/blob/main/ge... (that repo's readme might have more details).
Mind you, this is from a few months back! Not sure if this is still the best approach ¯\_(ツ)_/¯
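For anyone curious what such a wrapper boils down to, here's a rough sketch (not the linked repo's actual code; the binary name, flags, and model path assume a standard whisper.cpp build, so adjust for your setup):

```python
# Sketch of calling whisper.cpp's whisper-cli from Python via subprocess.
# Model/audio paths and the "-otxt" output convention are assumptions
# based on a typical whisper.cpp build; check `whisper-cli --help`.
import subprocess
from pathlib import Path

def build_whisper_cmd(model_path, audio_path, cli="whisper-cli"):
    """Build the argument list for a whisper-cli transcription run."""
    # -m selects the ggml model file, -f the input audio,
    # -otxt asks whisper-cli to write a plain-text transcript
    return [cli, "-m", str(model_path), "-f", str(audio_path), "-otxt"]

def transcribe(model_path, audio_path, cli="whisper-cli"):
    """Run whisper-cli and return the transcript text."""
    subprocess.run(build_whisper_cmd(model_path, audio_path, cli), check=True)
    # whisper.cpp writes the transcript next to the input as <audio>.txt
    return Path(str(audio_path) + ".txt").read_text()
```

Then it's just `transcribe("ggml-large-v3.bin", "talk.wav")`. The nice part of shelling out like this (vs. Python bindings) is you get whisper.cpp's own optimized build with zero extra dependencies.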