user568439 2 days ago

I just happen to have a 3-hour recording that needs transcription, and I couldn't get it to work with Whisper. It has 3 special characteristics:

- Huge size (400MB). It can be split, but then I want a single text file with correct timestamps.

- There are 3 speakers, and one is speaking far from the microphone and in a low voice. Whisper sometimes ignores this speaker.

- The last and most difficult is that 2 languages are being used at the same time. The same speaker might use Dutch or English, and even mix both in one sentence.

Is there a way to deal with all that?

lostmsu 2 days ago

Whisper 3 Large should be able to handle multiple languages in the same audio. Have you used that?
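
On the splitting point, here is a minimal sketch of how per-chunk transcripts could be stitched back into one text file with timestamps relative to the full recording. It assumes you already have, for each chunk, its start offset in seconds and a list of segments with `start`, `end`, and `text` keys measured from the start of that chunk (the shape of `result["segments"]` in the openai-whisper Python package). The names `merge_chunks`, `format_timestamp`, and `to_text` are made up for this example, and the sample data is invented.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def merge_chunks(chunks):
    """Shift each chunk's segments by the chunk's start offset and merge them.

    chunks: list of (chunk_start_seconds, segments) pairs, where each segment
    is a dict with "start", "end" (seconds from chunk start) and "text".
    """
    merged = []
    for offset, segments in chunks:
        for seg in segments:
            merged.append({
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
                "text": seg["text"].strip(),
            })
    # Sort so the result is chronological even if chunks arrive out of order.
    merged.sort(key=lambda s: s["start"])
    return merged


def to_text(merged):
    """One line per segment: [start - end] text."""
    return "\n".join(
        f"[{format_timestamp(s['start'])} - {format_timestamp(s['end'])}] {s['text']}"
        for s in merged
    )


# Invented example: two chunks, the second starting 30 minutes in.
chunks = [
    (0.0, [{"start": 1.5, "end": 3.0, "text": " Hello"}]),
    (1800.0, [{"start": 2.0, "end": 4.25, "text": " Hallo again"}]),
]
merged = merge_chunks(chunks)
print(to_text(merged))
```

This only fixes the timestamps. If a chunk boundary cuts through a sentence, you would probably want to split at silences or use overlapping chunks and drop the duplicated segments.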