illright 3 days ago

Also well worth mentioning is Stable-TS: https://github.com/jianfch/stable-ts

Out of the box it can transcribe with Whisper or Faster-Whisper, but it can also align audio with an existing human-written transcript, adding timestamps without losing the transcript's accuracy. That last feature was something I really needed, and my attempt at building it myself ended up much worse, so I'm glad I found this.
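For anyone curious, here is a minimal sketch of the alignment workflow, assuming stable-ts is installed (`pip install stable-ts`) and that its `load_model`, `align`, and `to_srt_vtt` calls behave as described in the project README. The function names `load_transcript` and `align_to_srt`, the file paths, the `base` model size, and the `en` language code are placeholders of mine, not anything from the library.

```python
from pathlib import Path


def load_transcript(path: str) -> str:
    """Read a human-written transcript and collapse runs of whitespace."""
    return " ".join(Path(path).read_text(encoding="utf-8").split())


def align_to_srt(audio_path: str, transcript_path: str, srt_path: str,
                 model_name: str = "base", language: str = "en") -> None:
    """Attach timestamps to an existing transcript and save them as SRT."""
    # Imported lazily so load_transcript() works without stable-ts installed.
    import stable_whisper

    model = stable_whisper.load_model(model_name)
    text = load_transcript(transcript_path)
    # align() takes the transcript text as given and estimates timestamps
    # for it, rather than transcribing the audio from scratch.
    result = model.align(audio_path, text, language=language)
    result.to_srt_vtt(srt_path)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    align_to_srt("episode.mp3", "episode.txt", "episode.srt")
```

The transcript is passed in as plain text, so the wording in the output stays exactly what the human wrote and only the timing comes from the model.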

I self-host it using Modal.com, as do some other commenters.

fchilmi 1 day ago

How much do you spend on Modal.com?