Rev.ai offers the most accurate speech-to-text (STT) APIs to meet your business and technical needs: asynchronous STT and streaming STT.

Asynchronous STT

Asynchronous STT produces transcripts from pre-recorded audio files. Our features include:

  • Global English coverage - our single English model supports all major English accents
  • Best-in-class accuracy - we train our speech models on tens of thousands of hours of human-transcribed audio content. The result? The lowest word error rate (WER) in our benchmarking tests, ahead of Google, Amazon, Microsoft, and Speechmatics.
  • Advanced punctuation, capitalization, and inverse text normalization (ITN) - transcripts are automatically polished so they're readable and easy to understand
  • Verbatim - capture every word that is said, including "ums" and "uhs"
  • Timestamps - receive a timestamp for every word that is said
  • Speaker diarization - recognize multiple speakers and attribute text to each
  • Speaker channel support - transcribe each channel of multi-channel audio separately
  • Custom vocabulary - share unique terms with us so we can recognize them
  • Deployment options - deploy Rev.ai in the cloud or on-premises
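
For readers unfamiliar with the word error rate (WER) metric mentioned above, it is conventionally defined as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below computes it with a standard word-level edit distance. It is only an illustration: the function name is hypothetical, the normalization (lowercasing and whitespace splitting) is an assumption, and it is not the benchmarking procedure Rev.ai uses.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: words are compared case-insensitively after splitting on
    whitespace. Real benchmarks usually apply more careful normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("the quick brown fox", "the quick brown dog"))
```

A lower value means a more accurate transcript. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.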

Need more technical details? Read the documentation for our async API. 

Streaming STT

Streaming STT produces transcripts in real time as people speak. Our features include:

  • Global English coverage - our single English model supports all major English accents
  • Best-in-class accuracy - we train our speech models on tens of thousands of hours of human-transcribed audio content. The result? The lowest word error rate (WER) in our benchmarking tests, ahead of Google, Amazon, Microsoft, and Speechmatics.
  • Advanced punctuation, capitalization, and inverse text normalization (ITN) - transcripts are automatically polished so they're readable and easy to understand 
  • Verbatim - capture every word that is said, including "ums" and "uhs"
  • Timestamps - receive a timestamp for every word that is said
  • Custom vocabulary - share unique terms with us so we can recognize them
  • Deployment options - deploy Rev.ai in the cloud (we plan to offer on-premises deployment in the future)
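
Streaming transcription generally works by sending audio to the service in small pieces while it is still being captured. The sketch below shows only that client-side chunking step. Everything specific in it is an assumption rather than part of the Rev.ai API: the helper name is hypothetical, the audio is assumed to be raw 16 kHz, 16-bit mono PCM, and the 250 ms chunk duration is arbitrary. Connecting to the service and sending each chunk is deliberately left out; see the streaming API documentation for the supported formats and transport.

```python
from typing import Iterator

# Assumed audio format (not taken from the Rev.ai documentation):
SAMPLE_RATE_HZ = 16_000   # samples per second
BYTES_PER_SAMPLE = 2      # 16-bit PCM, mono
CHUNK_MS = 250            # arbitrary chunk duration

# 16,000 samples/s * 2 bytes * 0.25 s = 8,000 bytes per chunk.
CHUNK_BYTES = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHUNK_MS // 1000


def iter_audio_chunks(audio: bytes, chunk_size: int = CHUNK_BYTES) -> Iterator[bytes]:
    """Yield consecutive slices of `audio`, each at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        yield audio[start:start + chunk_size]


if __name__ == "__main__":
    # 20,000 bytes of silence stands in for captured microphone audio.
    captured = b"\x00" * 20_000
    for chunk in iter_audio_chunks(captured):
        # A real client would send `chunk` to the streaming endpoint here.
        print(len(chunk))
```

Smaller chunks reduce the delay before words appear but increase the number of messages sent, so the right size depends on the latency your application can tolerate.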

Need more technical details? Read the documentation for our streaming API. 
