MacWhisper
Jul 15 2024
MacWhisper is a transcription app that can run standalone or in the cloud, and it uses OpenAI's tech. I bought the Pro version (39 Euros) -- with the free version you can only use the smaller language model (which still worked great when I tried it).
Set your default settings: Language (auto or a specific language) and Model (small, 500 MB; medium, 1.5 GB; large, 3.0 GB). Drag an audio or video file onto the app window and it starts transcribing, eventually giving you an estimated time to completion. Since I bought it I use medium (free only has small), which transcribes at best at x0.8 speed (i.e. a 4 hour file takes 5 hours), but it's more like x0.5 speed or slower if I'm doing light stuff on my MBP (last Intel version). AI transcription is really designed for an M1 or better chip, as those have dedicated neural engines. (The small model is about twice as fast, so if you want speed use that, in which case you don't really need the paid version.)
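To make the speed numbers concrete, here's a rough back-of-the-envelope sketch in Python. The function name is just something I made up for illustration, the speed factors are the ones I observed on my machine, and the x1.6 figure for the small model is only an assumption based on "about twice as fast as medium", so treat the output as a ballpark rather than a benchmark.

```python
def transcription_hours(audio_hours: float, speed_factor: float) -> float:
    """Estimate wall-clock hours to transcribe audio at a given speed factor.

    A speed factor of 1.0 means real time; 0.5 means half real-time speed.
    """
    if speed_factor <= 0:
        raise ValueError("speed_factor must be positive")
    return audio_hours / speed_factor


# Speed factors from my own runs; the small-model value is an assumption
# (roughly twice the medium model's best case).
scenarios = [
    ("medium, best case", 0.8),
    ("medium, laptop in use", 0.5),
    ("small, assumed 2x medium", 1.6),
]

for label, factor in scenarios:
    hours = transcription_hours(4, factor)
    print(f"{label}: about {hours:.1f} hours for a 4 hour file")
```

Swap in your own file length and whatever speed you actually see to decide whether a run will finish overnight.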
Once it's done you can export raw text, segmented text (with timestamps), or a MacWhisper file, which I guess makes it easier to just load it and re-export without another transcription phase. There's also the ability to assign speaker names, though I haven't tried it.
There's a batch mode, though you have to set how it saves all the transcription outputs (e.g. I can run a batch and have all transcriptions saved as raw text). One thing it lacks is the ability to pause, which I'd find convenient since it's at 850% CPU when it's running (Handbrake uses 1200% CPU, though that does have a pause function).
I use this to transcribe Twitch VODs. Accuracy is pretty good. It can detect music and put [music] or even [high energy music], and it seems pretty good about filtering out background music. It can detect other languages and put [Polish speech], and I assume it would transcribe it if the language is set to Polish.
I think the app comes from a one-person developer, and it's in active development with multiple updates per month. I think without an M1 or better Mac it's too slow to be useful unless you're patient.
I also tried Aiko, which also uses OpenAI tech, though I'm not sure which language model size -- maybe medium based on its speed. Aiko has almost no customization options, though on the other hand it's free and available on the Mac App Store.