Definition

speech recognition

Contributor(s): Karolina Kiwak

Speech recognition is the ability of a machine or program to identify words and phrases in spoken language and convert them to a machine-readable format. Rudimentary speech recognition software has a limited vocabulary of words and phrases, and it may only identify these if they are spoken very clearly. More sophisticated software has the ability to accept natural speech.

How it works

Speech recognition relies on algorithms that combine acoustic modeling and language modeling. Acoustic modeling represents the relationship between linguistic units of speech and audio signals; language modeling assigns probabilities to word sequences, which helps the system distinguish between words that sound similar.
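To illustrate how language modeling can separate similar-sounding words, here is a minimal sketch of a bigram language model with add-one smoothing. The tiny training corpus, the function names and the candidate transcripts are all hypothetical and chosen only for illustration; production systems train on far larger text collections and typically use more capable models.

```python
import math
from collections import Counter

# Hypothetical toy corpus, for illustration only.
CORPUS = [
    "i want to book a flight",
    "we want to leave now",
    "i need two tickets",
    "book two seats please",
]

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    context_counts, bigram_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(words)
        context_counts.update(words[:-1])
        bigram_counts.update(zip(words, words[1:]))
    return context_counts, bigram_counts, vocab

def log_prob(sentence, model):
    """Log probability of a word sequence under the add-one smoothed model."""
    context_counts, bigram_counts, vocab = model
    words = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(words, words[1:]):
        numerator = bigram_counts[(prev, word)] + 1
        denominator = context_counts[prev] + len(vocab)
        total += math.log(numerator / denominator)
    return total

def pick_transcript(candidates, model):
    """Return the candidate word sequence the model finds most likely."""
    return max(candidates, key=lambda c: log_prob(c, model))

model = train_bigrams(CORPUS)
# "to" and "two" sound alike; the word context decides between them.
best = pick_transcript(
    ["we want to book a flight", "we want two book a flight"], model
)
print(best)
```

In this sketch the candidate containing "want to book" scores higher because those word pairs appear in the toy corpus, while "want two" and "two book" never do.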

Hidden Markov models are often used as well. They recognize temporal patterns in speech, which improves the accuracy of the system.
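As a rough illustration of how a hidden Markov model captures temporal patterns, the sketch below uses the Viterbi algorithm to find the most likely sequence of hidden states behind a series of observations. Everything specific here is an assumption made for the example: the two states ("silence" and "speech"), the "low" and "high" energy observations, and all of the probabilities. A real recognizer would use states tied to speech sounds and acoustic features extracted from the audio.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden state sequence for the observations."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]

# Hypothetical model parameters, for illustration only.
STATES = ["silence", "speech"]
START_P = {"silence": 0.8, "speech": 0.2}
TRANS_P = {
    "silence": {"silence": 0.7, "speech": 0.3},
    "speech": {"silence": 0.2, "speech": 0.8},
}
EMIT_P = {
    "silence": {"low": 0.9, "high": 0.1},
    "speech": {"low": 0.3, "high": 0.7},
}

frames = ["low", "high", "high", "low", "low"]
path = viterbi(frames, STATES, START_P, TRANS_P, EMIT_P)
print(path)  # ['silence', 'speech', 'speech', 'silence', 'silence']
```

The transition probabilities are what encode the temporal pattern: because states tend to persist from one frame to the next, the decoder weighs each frame against its neighbors rather than labeling it in isolation.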

Applications

The most frequent applications of speech recognition within the enterprise include call routing, speech-to-text processing, voice dialing and voice search.

Pros and cons

While convenient, speech recognition technology still has issues to work through as it continues to develop. Its main advantages are that it is easy to use and readily available: speech recognition software is now frequently installed on computers and mobile devices, allowing for easy access.


Speech recognition offers a way to communicate with the technology around us.

The downsides of speech recognition include difficulty capturing words because of variations in pronunciation, limited support for most languages other than English and difficulty filtering out background noise. These factors can lead to inaccuracies.

Performance

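Speech recognition performance is measured by accuracy and speed. Accuracy is measured with word error rate (WER). WER works at the word level: it counts the words that were substituted, deleted or inserted in a transcription relative to a reference transcript and divides that count by the number of words in the reference. It identifies inaccuracies in transcription, although it cannot identify how the error occurred. Speed is measured with the real-time factor (RTF), the ratio of processing time to the duration of the audio. A variety of factors can affect computer speech recognition performance, including pronunciation, accent, pitch, volume and background noise.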
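The two measures can be sketched in a few lines. The example below computes word error rate as the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference, and the real-time factor as processing time divided by audio duration. The function names and sample sentences are hypothetical, and the sketch assumes transcripts are already normalized (for example, lowercased with punctuation removed).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)

def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time relative to audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds

# One substituted word out of four reference words.
print(word_error_rate("call the help desk", "call a help desk"))  # 0.25
# 2 seconds of processing for 10 seconds of audio.
print(real_time_factor(2.0, 10.0))  # 0.2
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis contains many extra words. The single number also shows why WER cannot explain how an error occurred: a substitution caused by background noise and one caused by an unfamiliar accent are counted the same way.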

The terms speech recognition and voice recognition are sometimes used interchangeably, but they mean different things. Speech recognition identifies words in spoken language. Voice recognition is a biometric technology used to identify a particular individual's voice, that is, for speaker identification.

This was last updated in December 2016
