Custom Vocabulary (“CV”) enables developers to create a list of non-traditional words or phrases that may frequently appear in audio and have them more accurately represented in the Rev Automatic Speech Recognition (“ASR”) model output. These are brand names, uniquely spelled terms, or words that an everyday person may not use.
You also have the ability to:
Create a list to send along on a per job basis for the Asynchronous API;
Pre-compile a list of terms that is referred to by a ‘Custom Vocabulary Id’ and can be used with both the Asynchronous API and the Streaming API.
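As a rough illustration of the two options above, the sketch below builds the JSON body for an asynchronous job in each style. The field names (`media_url`, `custom_vocabularies`, `phrases`, `custom_vocabulary_id`), the example URL, and the example id are assumptions made for illustration; confirm them against the API reference before relying on them.

```python
import json

def job_payload_with_inline_list(media_url, phrases):
    """Per-job list: the phrases travel with the job request (async only).

    Field names are assumed for illustration, not taken from this guide.
    """
    return {
        "media_url": media_url,
        "custom_vocabularies": [{"phrases": list(phrases)}],
    }

def job_payload_with_compiled_id(media_url, vocabulary_id):
    """Pre-compiled list: the job refers to a previously compiled vocabulary by id."""
    return {
        "media_url": media_url,
        "custom_vocabulary_id": vocabulary_id,
    }

# Hypothetical media URL and vocabulary id, used only to show the shape of each body.
inline = job_payload_with_inline_list(
    "https://example.com/audio.mp3", ["Rev", "fantabulous"]
)
compiled = job_payload_with_compiled_id(
    "https://example.com/audio.mp3", "cv-EXAMPLE-ID"
)

print(json.dumps(inline, indent=2))
print(json.dumps(compiled, indent=2))
```

The functions only build dictionaries, so nothing is sent over the network. The practical difference is that the first body carries the whole term list with each job, while the second carries a single id.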
Some examples of terms that may appear on a CV list include ‘Rev’ or ‘fantabulous.’
You can find another General Questions document that outlines the benefits of Custom Vocabulary here.
This is an extremely powerful feature that is designed to empower accessibility and readability of ASR transcripts, and we wanted to put together this quick guide on how best to use this in your ASR workflow.
The number one rule to follow when using this feature: don’t speculate on terms that could appear in a file. Instead, select the terms of high importance to you and your end users that need to be accurately represented in the output. We understand that in some cases you may have a large number of jobs to send in a batch and may not know all of the terms right away. Even so, it is better to focus on the terms you know will persist and that you want the model to get correct; additional terms that don’t appear in the audio may actually degrade word error rate (WER) performance. You can always update your CV list or pre-compiled list to include additional words that you recognize as important but that the ASR model may have missed.
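To check whether a CV list is helping or hurting, one option is to compute WER against a trusted reference transcript before and after adding the list. The sketch below uses the standard word-level edit distance definition of WER. The function name and the sample transcripts are made up for illustration and are not real ASR output.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of words in the reference.

    Both transcripts are lowercased and split on whitespace, so casing
    differences are ignored and punctuation stays attached to words.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)

# Illustrative transcripts only.
reference = "welcome to the food psych podcast"
without_cv = "welcome to the food sike podcast"
with_cv = "welcome to the food psych podcast"

print(word_error_rate(reference, without_cv))
print(word_error_rate(reference, with_cv))
```

If the second number is higher than the first across a representative set of files, that is a signal to trim the list.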
Here is a list of some best practices that you can refer to when implementing CV in your workflow:
Do’s
Create a targeted list of known or highly likely terms: if your CV term often appears in a phrase, include a concise but representative (not paraphrased) version of the phrase in your list as well;
Compile your CV list ahead of time: this helps with async performance (vs. sending a list with each job) and is the only way to use CV with streaming;
Re-examine your audio and CV list: if you start to see WER increase after you leverage CV, narrow down the terms to focus on the ones of highest importance;
Be conscious of capitalization and casing: we want to represent your CV term in the most accurate way (see Don’ts for a word of caution on this).
Don’ts
Mix languages in your CV term list: our ASR model is not multilingual, so we strongly recommend you focus on the terms in the main source language of your audio;
Include common or well-known words/word spellings: these are likely already contained within the model lexicon;
Use sentences or long phrases: phrases will be boosted as unigram CV terms, but CV performance degrades past 5 words in a phrase;
Use CV just for capitalization or stylistic elements of a term: while our ASR model will likely get it right, this feature is better served by focusing on the spelling of the specialized terms.
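Some of the checks in the list above can be automated before a list is submitted. The sketch below is one possible pre-flight review and is not part of the product. The `COMMON_WORDS` set is a tiny hypothetical stand-in for a real common-word list, and the function name is an assumption. It flags phrases longer than 5 words, entries made only of common words, and entries that differ from an earlier entry only in casing.

```python
MAX_WORDS = 5  # the guidance above: CV performance degrades past 5 words

# Hypothetical stand-in; a real check would use a much larger word list.
COMMON_WORDS = {"the", "and", "with", "getting", "started", "meeting", "hello", "today"}

def review_cv_list(phrases):
    """Return (phrase, warning) pairs for entries that go against the guidance."""
    warnings = []
    first_seen = {}  # lowercased phrase -> first spelling encountered
    for phrase in phrases:
        words = phrase.split()
        if len(words) > MAX_WORDS:
            warnings.append((phrase, "longer than 5 words"))
        if words and all(w.lower() in COMMON_WORDS for w in words):
            warnings.append((phrase, "only common words"))
        key = phrase.lower()
        if key in first_seen and first_seen[key] != phrase:
            warnings.append((phrase, f"differs only in casing from {first_seen[key]!r}"))
        first_seen.setdefault(key, phrase)
    return warnings

candidates = [
    "Rob Szypko",
    "getting started with food psych and body positivity",
    "hello",
    "REV",
    "Rev",
]
for phrase, warning in review_cv_list(candidates):
    print(f"{phrase}: {warning}")
```

Mixed-language detection is left out on purpose, since it needs more than a word list to do reliably.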
Here are some examples of entries you might consider adding to your CV list:
Good CV Entry
A unique name: if someone has a common first name and unusual last name (or vice versa), you can add it to CV as a phrase (instead of just adding the unusual name on its own)
E.g. “Rob Szypko”
Bad CV Entry
A long phrase: phrases that are longer than 5 words or that have too many ‘common’ words may actually result in longer processing time without the term-recognition specificity you are looking for. It is better to focus on adding the ‘unusual’ words to the CV, and it is good practice to include shorter phrases made of words that frequently appear next to the unusual words.
E.g. “getting started with food psych and body positivity”
This would be better served by a shorter phrase: ‘food psych’.
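One way to find which words frequently appear next to an unusual word is to count its immediate neighbors in transcripts you already have. The sketch below does that with a simple whitespace split. The function name and the sample transcripts are hypothetical, and punctuation is not handled.

```python
from collections import Counter

def frequent_neighbors(transcripts, term, top_n=3):
    """Count the words that appear immediately before or after `term`."""
    term = term.lower()
    counts = Counter()
    for text in transcripts:
        words = text.lower().split()
        for i, word in enumerate(words):
            if word != term:
                continue
            if i > 0:
                counts[("before", words[i - 1])] += 1
            if i + 1 < len(words):
                counts[("after", words[i + 1])] += 1
    return counts.most_common(top_n)

# Made-up transcripts for illustration.
transcripts = [
    "welcome to food psych with your host",
    "this week on food psych we talk",
    "food psych is back",
]
print(frequent_neighbors(transcripts, "psych", top_n=1))
```

Here ‘food’ precedes ‘psych’ in every sample, which supports adding the short phrase ‘food psych’ rather than the full sentence.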
We’re working hard here at Rev AI to make sure that we understand every human voice - and we’re always looking for input and great ideas to improve our speech recognition engine. If you have any questions, thoughts, or requests, let us know by contacting us at: firstname.lastname@example.org.
Interested in using Rev's ASR? Get in touch with us here.