Five different AI options for transcribing audio
Rev and Sonix compete with Otter.ai by providing new features and multi-language transcription for journalists
For many of us, Otter.ai has been a loyal friend for years. The popular transcription platform has saved us loads of time by transforming recorded interviews into transcripts that are editable, searchable and shareable. But since Otter dropped its unrestricted free version, journalists have realized that the platform has limitations that make it worth looking at other emerging options. Otter still cannot transcribe non-English languages, and the paid tiers are not always affordable for smaller newsrooms or freelancers.
While we could not test every AI transcription service out there, we chose five that came highly recommended or were free alternatives with unique tools that could help in day-to-day reporting.
We partnered with KBIA to run an audio interview through each of these services: Parrot AI, Otter AI, Google Pinpoint, Rev and Sonix.
We tested three recordings: a professionally recorded English interview, a Spanish interview recorded as an iPhone video, and a Zoom interview recorded with smartphones. Running each one through the transcription services let us compare different formats, languages and levels of background noise.
Here’s what we found.
If you’re looking to save time
This is often the top reason journalists seek a transcription service: If we can spend more time reporting and writing, and less time transcribing interviews, that’s a win. So, we timed how long it took for each service to upload and transcribe the 25-minute interview.
The quickest was Sonix. In just under 3 minutes, Sonix uploaded and transcribed the interview. Rev was not far behind at 4 minutes 30 seconds.
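To put those timings in perspective, here is a rough back-of-the-envelope sketch of how much faster than real time each service worked. It only uses the figures above; treating "just under 3 minutes" as a full 3.0 minutes is our assumption, so the Sonix number is a conservative estimate, and the function and variable names are purely illustrative.

```python
# Length of the test interview, in minutes (from the article).
INTERVIEW_MINUTES = 25.0

# Upload-plus-transcription times, in minutes.
processing_minutes = {
    "Sonix": 3.0,  # assumption: "just under 3 minutes" rounded up to 3.0
    "Rev": 4.5,    # 4 minutes 30 seconds
}


def speed_factor(audio_minutes: float, wait_minutes: float) -> float:
    """Return how many minutes of audio were handled per minute of waiting."""
    return audio_minutes / wait_minutes


factors = {
    name: speed_factor(INTERVIEW_MINUTES, minutes)
    for name, minutes in processing_minutes.items()
}

for name, factor in factors.items():
    print(f"{name}: roughly {factor:.1f}x real-time speed")
```

By this estimate, Sonix handled the interview at a little over eight times real-time speed and Rev at between five and six times.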
Another plus for multi-tasking journalists: Sonix, Rev and Otter send email alerts when your transcription is finished.
But speed only means so much if the transcript requires a lot of editing to clean it up for accuracy.
For English interviews with mostly clear audio, most of these services transcribed with few errors. Google Pinpoint performed the worst, as it does not separate speakers and often breaks up paragraphs poorly, which makes copying and pasting quotes difficult.
Rev did a good job picking up colloquialisms and breaking up sentences with commas, such as in this sentence: “And so we figured, hey, give ’em something from back home, have someone speak Spanish and have that environment that they had back home.”
Sonix similarly required very little editing, and it even knew to capitalize the name of a community center. Sonix also made it easy to export the final transcript to formats like Microsoft Word or text.
Otter’s transcription was decent, though it struggled with the spelling of names. One big time-saver for Otter is the outline tool along the side. It’s still in beta, but this tool breaks down the major topics of your interview and makes it easy to navigate quickly.
Parrot had some inconsistencies that might annoy AP style devotees, such as writing “fifth grade” and “6th grade” in two different styles or missing capitalization, but it was mostly accurate.
Like Otter, Rev and Sonix, Parrot will automatically scroll and follow each word in the transcript as the audio plays. If you click on a word, the audio will pick up from that spot. However, unlike in the other services, Parrot’s word-tracking doesn’t work well when editing.
Another time-saving note: In Otter, Rev and Sonix, you can add common words (jargon, acronyms, names of people, etc.) to a glossary or dictionary to teach the AI to recognize them each time you upload a recording. But that option is also limited in Otter’s free plan.
If you want to transcribe non-English interviews
Otter and Parrot are both English only, a disappointing limitation when so many languages are spoken worldwide.
Sonix and Rev each transcribed a Spanish interview clearly, identifying different speakers and requiring minimal edits — though more edits were needed than with the English interview.
If you need to transcribe a variety of different languages, Sonix might be the choice for you because it allows you to pick from a list of more than 38 languages in a dropdown menu before transcribing. While Rev offers 30 different languages in beta, you must remember to change your default language in settings each time.
Google Pinpoint has the ability to transcribe 13 different languages, but again, the separation of paragraphs and speakers was poor in the actual transcription.
A note: Don’t write off Google yet. Our experiment focused on Pinpoint because it is marketed as designed for journalists, but in one of our interviews we were told that Google Cloud has a speech-to-text tool that offers more than 80 different languages and has proven more accurate and better at distinguishing voices. Read more about that next week in our Q&A!
If you have a small budget
Parrot AI is completely free (for now) and would work well if you conduct a lot of interviews or meetings that need transcription.
While Otter offers a free version, this level has become more limited. Beyond the 30-minute conversation limit, it’s especially frustrating that Otter only allows three imported audio files on the free level. So, if you don’t record and transcribe in real time, you’re kind of out of luck.
Sonix and Rev both require subscriptions, or you can pay as you go by the minute.
If you want to transcribe in the field
While some reporters prefer to upload their recorded interviews to a computer, others may need to edit and transcribe while out in the field. If you want an option with a mobile app, your choices are Otter and Rev.
Both allow you to record and transcribe in real time or import a file, but here’s what’s different: In Otter, almost all of the helpful editing and navigation features (like the outline, highlighting, commenting and sharing) are still available in the app.
Rev’s editing is more basic on the phone, but it offers free audio recording and call recording apps. (This was great news for someone who used to depend on TapeACall for recording phone interviews, which is now $10/month or $60/year.) While the apps are free, you will still need to pay to transcribe the audio or phone call in Rev, either through a subscription or by the minute.
If you want to share transcriptions easily with a team
Each service offered a way to share transcriptions and audio recordings with team members through a link or by email. Many of them allow you to change the permissions so the other team members can edit or view only. Pinpoint was the exception in that team members could not edit the transcript after it was shared.
Unfortunately, none of the services allow multiple users on one account unless you pay for Otter’s Business plan, though password sharing isn’t strictly prohibited.
If you do share an account, an important note for Rev is that only one person can edit a transcript at a time. This restriction is pretty sensitive: even if you have Rev open in two tabs, it won’t let you edit in either tab.
If you’re concerned about data privacy
Otter, Rev and Sonix have clear security policies listed on their websites, and they each follow at least one well-known data security guideline or regulation.
The data security information for each service is linked here:
- Parrot AI: https://parrot.ai/terms
- Otter.ai: https://otter.ai/privacy-security
- Google Pinpoint: https://support.google.com/pinpoint/answer/11955675?hl=en
- Rev: https://www.rev.com/enterprise/security
- Sonix: https://sonix.ai/security
A reminder: Audio quality always matters
While technology is constantly improving, any AI transcription service may have a harder time if the audio is unclear or includes a lot of background noise. Some services will label words by level of confidence or report the transcript’s level of accuracy based on these factors.
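For readers curious what word-level confidence labels might look like in practice, here is a minimal sketch of how a transcript could be marked up for review. It does not reflect any particular service's export format: the `Word` structure, the 0-to-1 confidence scale, the 0.8 threshold and the sample scores are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    confidence: float  # assumed scale: 0.0 (a guess) to 1.0 (certain)


def flag_uncertain(words: List[Word], threshold: float = 0.8) -> str:
    """Join words into one line, wrapping low-confidence words as [word?]."""
    return " ".join(
        f"[{w.text}?]" if w.confidence < threshold else w.text
        for w in words
    )


# Invented sample scores for illustration only, not output from a real service.
sample = [
    Word("have", 0.97),
    Word("someone", 0.95),
    Word("speak", 0.62),
    Word("Spanish", 0.91),
]

flagged = flag_uncertain(sample)
print(flagged)  # have someone [speak?] Spanish
```

Marking only the doubtful words this way would let a reporter check a handful of spots against the audio instead of re-listening to the whole recording.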
Features breakdown by service
To help you pick the service that best fits your needs, here is the breakdown of what each service offers by feature.