Most speech recognition apps don’t have any trouble transcribing a native speaker recorded with a professional microphone in a quiet room. That isn’t a challenge.
So to test them more thoroughly, I created a “nightmare” recording of two non-native speakers with loud city background noise.
How did they fare?
Let’s find out.

Otter was one of the most frequently mentioned solutions when we asked for recommendations on Twitter and in the Ahrefs community. And for good reason. It’s easy to set up, has an intuitive interface, and offers transparent pricing.
Unique features
What stands out from the rest is the app’s ability to record online meetings and transcribe them, simply by pasting the meeting URL. But you can also import a video/audio file or record audio right in the app.
Besides, you can connect your calendar to never miss a meeting.
Transcript quality
I got decent results, but there was a lot to edit too.
It didn’t get some names right. But I can’t blame any tool for not picking up “Ahrefs” or “Tim Soulo” 100% of the time.

One thing I noticed is that once it notified me that the transcriptions were ready, it would still do something in the background (adjust time stamps, tag speakers, etc.). Like a student still scribbling on a test paper while handing it to the teacher.
Pricing
You can start for free and upgrade to a paid plan later. You can import up to three files and record 290 minutes of meetings before you need to upgrade (as of April 2023).

Setting up an account was a no-brainer. I found the interface easy to navigate as well. One personal observation is that it felt a bit too “cold” to use, since I saw things like “Place Order,” “Billing,” and “Invoice” way too often.
You might get the impression that it was designed by an accounting team (as opposed to Descript, which comes next in this roundup).
Unique features
Besides auto-generated transcripts, Rev offers live captions for Zoom meetings. You also have the option to place an order for human transcriptions.
Transcript quality
Poor audio with city noise was a bit too much for Rev. Some words were missing, while others were misrecognized. As a result, some paragraphs didn’t make much sense, while others were fine.

Pricing
You can transcribe the first audio file (up to 45 minutes) for free. I got a bill for $1.25 with a discount that resulted in a total of $0.00. Thanks, accounting team. 😉
Rev also has a 14-day trial of its paid plan, but it was tricky to find. You need to go to the footer of the homepage and look for it under “Services.”


Descript welcomed me by name (which was a nice coincidence). The first thing you need to know is that this is standalone software rather than a web service. It’s much more than a speech-to-text converter. It’s basically a video editing tool. And there’s definitely a learning curve. But luckily, the onboarding is extremely funny and engaging.

Unique features
As I mentioned, Descript is more of a video editing tool that is good at transcribing. I’d call it “Canva for video/captions.” You can add B-roll, effects, animations, and more.
You can simply drag and drop and basically produce a whole video with its help. But if you just need a transcript or captions for a video or audio file, you can do that too.
Transcript quality
My sample audio produced rather muddy results. At times, it had difficulty recognizing abbreviations (e.g., SEO). I also had a problem with removing filler words like “uh” and “um.”
I found that if I didn’t select the option to remove them, they, um, just stayed there even though I didn’t need them most of the time. But if I did choose to remove them, it sometimes ate parts of other words, causing even more trouble.
Also, it couldn’t recognize parts that a human being would have no problem understanding just from context, e.g., “Jack of all trades” became “jackal, trades.”
On the bright side, I believe you can still understand what the text is about.
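To illustrate how filler removal can eat parts of other words, here is a minimal Python sketch that works on transcript text only. It is not how Descript works internally (that isn’t covered here); the filler list, the sample sentence, and both function names are assumptions made up for this example. It contrasts a naive substring replacement with whole-word matching.

```python
import re

# Illustrative filler list (an assumption, not Descript's actual list).
FILLERS = ("uh", "um")

# Match a filler only as a whole word, plus an optional trailing comma and spaces.
FILLER_RE = re.compile(r"\b(?:uh|um)\b,?\s*", flags=re.IGNORECASE)


def remove_fillers_naive(text: str) -> str:
    """Delete every occurrence of a filler, even inside longer words."""
    for filler in FILLERS:
        text = text.replace(filler, "")
    return text


def remove_fillers(text: str) -> str:
    """Delete fillers only when they stand alone as words."""
    cleaned = FILLER_RE.sub("", text)
    # Collapse any double spaces left behind by the removal.
    return re.sub(r"\s{2,}", " ", cleaned).strip()


sample = "So, um, the summary is, uh, short."
print(remove_fillers_naive(sample))  # damages "summary" because it contains "um"
print(remove_fillers(sample))        # leaves "summary" intact
```

The word-boundary check (`\b`) is what keeps “summary” safe in the second version. Punctuation left around a removed filler may still need a manual pass.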

Pricing
You can start with basic functions for free and upgrade if needed.

MacWhisper is a transcription tool powered by Whisper, an automatic speech recognition (ASR) system developed by OpenAI, the same company that brought us ChatGPT.
As OpenAI states on its website:
Whisper is trained on 680,000 hours of multilingual and multitask supervised data collected from the web.
Whisper isn’t something you can simply “run” as is. What’s more, it’s pretty tricky to set up if you do want to run it yourself. GitHub, Python; you get the gist.
Fortunately, there are tools like MacWhisper that take this off your shoulders and let you use the power of AI in a simple user interface.
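For the curious, here is a rough Python sketch of what running Whisper yourself can look like. It assumes the open-source `openai-whisper` package and `ffmpeg` are installed. The helpers `format_timestamp`, `format_segments`, and `transcribe` are names made up for this example and are not part of Whisper or MacWhisper.

```python
from typing import Dict, List


def format_timestamp(seconds: float) -> str:
    """Turn a number of seconds into MM:SS (fractions are dropped)."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(segments: List[Dict]) -> str:
    """Render segments (dicts with 'start', 'end', 'text') as time-stamped lines."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe(path: str, model_name: str = "small") -> str:
    """Transcribe an audio file with Whisper and return time-stamped text."""
    # Imported lazily so the helpers above work without the package installed.
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)  # e.g. "small" or "large"
    result = model.transcribe(path)
    return format_segments(result["segments"])
```

Calling `transcribe("interview.mp3", "large")` (a placeholder file name) downloads the chosen model on first use and then runs it locally. Like the plain Whisper output, this sketch does not tag speakers.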
Unique features
Just plain speech-to-text recognition with time stamps. Unfortunately, it doesn’t auto-tag the speakers.
Transcript quality
When you run the tool, you have to choose a “model” to work with. Basically, the lighter the model, the faster it will run. But bigger models will produce better results. Also, in MacWhisper, those bigger (better but slower) models are only available in the paid version.
I decided to start with the free “small” model, which was stated to have “normal speed with good accuracy.”
It was OK, but no better than the competition. I thought it would work fine with high-quality audio, but not with the terrible examples I fed it.
“AI is overrated,” I thought. But before closing the Mac and switching back to my dear Windows PC, I decided to give the “large” model a try.
And you know what? AI isn’t overrated. I found the results to be much better than the rest.
The transcript was really, really good. It even got things like “Ahrefs” and “SaaS” right! Though still not 100% of the time.

Pricing
You can run smaller models for free. For the large model, you’ll need to purchase a license.

This tool is the easiest to use. Simply drag and drop your file, and it’s ready. It takes some time to process, though.
Unique features
Nothing besides downloading a transcription.
Transcript quality
My first impression was that the results were perfect because, visually, it delivered a confident-looking text:

But after proofreading, I noticed that it simply didn’t include the parts it failed to recognize, sometimes several words in a row.
Pricing
It’s free to use.

Premiere Pro isn’t exactly a “transcription tool” but rather video editing software. I’m including it because I assume that some companies may already have it in their arsenal (like we do).
To get to the transcription feature in Premiere Pro, just go to the “Captions and graphics” workspace and click “Create transcription.”

Unique features
If we take only speech recognition into account here, what it does well is creating precise time stamps, auto-tagging the speakers and, if needed, automatically adding an editable captions track to a video project.
Transcript quality
Let’s be frank: I found the noisy audio transcript to be a failure. I couldn’t comprehend what people were talking about in the first place.

Still, I think this feature can be really helpful if you are creating captions from high-quality audio. I’ve used it myself several times and had nothing to complain about when the recording quality was good.
Pricing
You need an Adobe Creative Cloud subscription to use Premiere Pro.

While signing up and uploading files is very simple, you have to spend some time answering questions about you and your company before you can finally get to the tool itself. And no, you can’t skip typing in your company name, your role, and your company size.
But once you get through this, the interface is clean and intuitive.
Unique features
You can generate a transcript or captions for video or audio. There is also an option to request a manual review of the transcript. You can also generate subtitles in a different language, so you get transcription and translation in one click.

Transcript high quality
Happy Scribe did a really good job transcribing the audio. It had no problem with words like “SEO” and “SaaS” (apparently the weakest point for many tools). It could also auto-tag the speakers, which might be helpful in certain situations.

Pricing
I could test one file for free. After that, I’d need to buy credits, which are used for each minute of video or audio transcribed.

Sonix is a tool for automatic transcriptions, translations, and integration with meeting apps.
Unique features
Besides meeting integration, which is kind of a given for most tools, AI summary generation is an interesting feature. It was in beta as of April 2023, but I already got impressive results from it.

You also get some extra tools to work with video captions: a timeline view and an option to split captions into several lines. You can also import an existing transcript, and Sonix will sync it with the audio.
Transcript quality
Sonix has a custom vocabulary feature. I found that it helped a bit with names like “Tim Soulo” and “Ahrefs,” but it didn’t work 100% of the time. It mostly did well. But sometimes, it mistook SEO for CEO and returned the word “Excel” seemingly out of nowhere.
The transcript made sense overall but would require quite a lot of edits to be perfect.

Pricing
Sonix has a free trial for 25 minutes of transcriptions. After that, you need to purchase pay-as-you-go credits or get a subscription.

Notta is yet another transcription service that works for both real-time meetings and existing recordings.
Unique features
Besides transcription, Notta focuses on streamlining certain workflows and offers features such as calendar sync and a scheduler (in beta as of April 2023).
Transcript quality
Background noise and poor audio quality weren’t deal breakers for Notta. The transcription results turned out mostly OK but still had some issues.

Sentence structure was sometimes a bit weird, certain words went missing, and my favorite “Jack of all trades” part wasn’t that neat this time.

Another thing worth noting is that, for some reason, it failed to recognize two speakers, and the whole interview was tagged as “Speaker 1.”
Pricing
You can start with a free basic subscription and try a three-day trial of the paid plan, Notta Pro.
Final thoughts
As you can see, there are plenty of tools to choose from. Still, it seems that OpenAI stirred things up a bit by releasing a free ASR (automatic speech recognition) system, which I found to be considerably more capable than the others.
But pure speech recognition quality is just one factor. Maybe you do need to record your Zoom meetings (Otter), work with captions in a large video project (Premiere Pro), or quickly create a Canva-style video (Descript).
Also, I want to stress that I was trying to push these tools to the edge by giving them a worst-case scenario recording. For more typical uses, the differences in the outcome might be much less noticeable.
It’s great to see that there are so many options out there, and I hope this review will help a bit in finding the one that is best for you.
Got questions? Ping me on Twitter.