Category report · 8 providers evaluated

Best Speech-to-Text APIs

Speech-to-Text APIs convert audio into text, spanning batch (async file) and real-time streaming transcription, with add-ons like speaker diarization, translation, and PII redaction. The category splits into focused voice-AI specialists (Deepgram, AssemblyAI, Speechmatics, Gladia, Rev AI) optimized for accuracy, latency, and generous self-serve free tiers, versus hyperscaler platforms (Google, AWS) and the model-API generalist (OpenAI Whisper) that ride massive infrastructure but offer thinner DX and stingier free tiers. Compare on documentation/DX quality, reliability and proven scale, SDK breadth and ecosystem, and how fast a developer or AI agent can self-serve a working key against transparent public pricing.

Highest rated: Deepgram (ABI 87.5). Real-time streaming STT for voice agents.

What is the best Speech-to-Text API?

#	Provider	Vendor	Documentation	Reliability	Ecosystem	Accessibility	ABI	Grade	Free tier
1	Deepgram	Deepgram	90	84	82	95	87.5	A	Yes
2	AssemblyAI	AssemblyAI	92	82	83	90	86.9	A	Yes
3	OpenAI Whisper / GPT-4o Transcribe	OpenAI	85	80	88	82	83.9	B	No
4	Google Cloud Speech-to-Text	Google	78	92	85	68	81.3	B	Yes
5	Amazon Transcribe	AWS	74	93	84	62	78.9	B	Yes
6	Speechmatics	Speechmatics	76	80	68	84	76.6	B	Yes
7	Gladia	Gladia	78	70	62	88	74.0	C	Yes
8	Rev AI	Rev	74	74	60	85	72.7	C	Yes

Table 1. Best Speech-to-Text APIs ranked by the APIbenchmarks Index. Specification columns are vendor-stated; ABI is computed per the published methodology.

Composite scores

Deepgram	87.5
AssemblyAI	86.9
OpenAI Whisper / GPT-4o Transcribe	83.9
Google Cloud Speech-to-Text	81.3
Amazon Transcribe	78.9
Speechmatics	76.6
Gladia	74.0
Rev AI	72.7

Scale 0–100. Highest in category: 87.5.

Figure 1. APIbenchmarks Index for Speech-to-Text APIs, bar length proportional to composite score; colour encodes letter grade.

Provider scorecards

1. Deepgram · Grade A · ABI 87.5 · Excellent

Voice-AI specialist with the Nova-3 and Flux models, known for sub-300 ms streaming latency and a developer-first console.

Documentation & DX: 90 · Reliability: 84 · Ecosystem & SDKs: 82 · Accessibility: 95

2. AssemblyAI · Grade A · ABI 86.9 · Excellent

Research-driven STT with the Universal and Slam-1 models and a deep audio-intelligence add-on stack (sentiment, topics, LeMUR LLM).

Documentation & DX: 92 · Reliability: 82 · Ecosystem & SDKs: 83 · Accessibility: 90

3. OpenAI Whisper / GPT-4o Transcribe · Grade B · ABI 83.9 · Strong

Transcription endpoints (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe) bundled into the broader OpenAI API; simple flat per-minute pricing, no STT-specific free tier.

Documentation & DX: 85 · Reliability: 80 · Ecosystem & SDKs: 88 · Accessibility: 82

4. Google Cloud Speech-to-Text · Grade B · ABI 81.3 · Strong

Hyperscaler STT (Chirp models) with 125+ languages, contractual enterprise SLAs and GCP-wide infrastructure, but heavier console onboarding.

Documentation & DX: 78 · Reliability: 92 · Ecosystem & SDKs: 85 · Accessibility: 68

5. Amazon Transcribe · Grade B · ABI 78.9 · Strong

AWS-native STT with volume tiering, deep IAM/S3 integration and proven hyperscaler reliability; powerful, but with verbose AWS-style docs and console.

Documentation & DX: 74 · Reliability: 93 · Ecosystem & SDKs: 84 · Accessibility: 62

6. Speechmatics · Grade B · ABI 76.6 · Strong

UK-based accuracy and multilingual specialist (55+ languages, strong accent coverage) with batch, real-time, and on-prem deployment options.

Documentation & DX: 76 · Reliability: 80 · Ecosystem & SDKs: 68 · Accessibility: 84

7. Gladia · Grade C · ABI 74.0 · Solid

European audio-infrastructure challenger wrapping Whisper-grade accuracy with all features (diarization, translation, code-switching) included at every tier.

Documentation & DX: 78 · Reliability: 70 · Ecosystem & SDKs: 62 · Accessibility: 88

8. Rev AI · Grade C · ABI 72.7 · Solid

STT arm of transcription company Rev, offering the Reverb and Whisper models plus optional human transcription; solid, but with a narrower SDK set.

Documentation & DX: 74 · Reliability: 74 · Ecosystem & SDKs: 60 · Accessibility: 85

Frequently asked questions

What is the best Speech-to-Text API?

By the APIbenchmarks Index, Deepgram rates highest (ABI 87.5, grade A); its focus is real-time streaming STT for voice agents. The ABI weights documentation, reliability, ecosystem, and accessibility. Price is reported separately, so the right pick still depends on your budget and workload.

Which speech-to-text APIs have a free tier?

Deepgram, AssemblyAI, Google Cloud Speech-to-Text, Amazon Transcribe, Speechmatics, Gladia, and Rev AI offer a free tier or trial credits. OpenAI has no STT-specific free tier.

How is the APIbenchmarks Index calculated?

The ABI is a weighted composite of four dimensions scored on absolute reference scales: documentation & DX (30%), reliability (25%), ecosystem & SDKs (25%), and accessibility (20%). Price is excluded from the composite because price units are not comparable across categories. The full formula is on the methodology page.
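As a rough illustration of the weighting described above, the following Python sketch computes a composite from the four dimension scores. It assumes the composite is a plain weighted sum rounded half-up to one decimal place; the full formula lives on the methodology page and may differ, but this assumption does reproduce the ABI values in Table 1. The names `WEIGHTS` and `abi` are hypothetical and chosen only for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Weights as stated in the answer above. Treating the composite as a plain
# weighted sum is an assumption, not the published formula.
WEIGHTS = {
    "documentation": Decimal("0.30"),
    "reliability": Decimal("0.25"),
    "ecosystem": Decimal("0.25"),
    "accessibility": Decimal("0.20"),
}


def abi(scores: dict[str, int]) -> Decimal:
    """Weighted sum of the four 0-100 dimension scores, to one decimal place."""
    total = sum(
        (WEIGHTS[name] * Decimal(scores[name]) for name in WEIGHTS),
        Decimal(0),
    )
    # Half-up rounding is assumed; Decimal avoids binary float artefacts
    # on ties such as 86.85.
    return total.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Dimension scores copied from Table 1.
deepgram = {"documentation": 90, "reliability": 84, "ecosystem": 82, "accessibility": 95}
assemblyai = {"documentation": 92, "reliability": 82, "ecosystem": 83, "accessibility": 90}

print(abi(deepgram))    # 87.5
print(abi(assemblyai))  # 86.9
```

Because accessibility carries only 20% of the weight, a provider such as Google Cloud Speech-to-Text can score 68 on that dimension and still land at 81.3 on the strength of its reliability and ecosystem scores.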
