Voice or speech recognition is the ability of a machine or program to receive and interpret dictation, or to understand and carry out spoken commands.
Before a computer can process speech, the analog audio must be converted into a digital signal by an analog-to-digital (A/D) converter. To decipher that signal, the computer needs a digital database, or vocabulary, of words or syllables, and a fast means of comparing the stored data with the incoming signal. The speech patterns are stored on the hard drive and loaded into memory when the program is run. A comparator then checks these stored patterns against the output of the A/D converter.
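The conversion and comparison steps described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular product works: the function names (sample_and_quantize, distance, best_match), the 8-bit resolution, the 8000 Hz sample rate, and the use of pure tones as stand-ins for spoken words are all assumptions made for the example. Practical recognizers generally compare derived acoustic features rather than raw samples.

```python
import math

def sample_and_quantize(signal, num_samples, sample_rate_hz, bits=8):
    """Simulate an A/D converter: sample signal(t), expected in [-1, 1],
    and quantize each sample to a signed integer of the given bit depth."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits
    samples = []
    for i in range(num_samples):
        value = signal(i / sample_rate_hz)
        value = max(-1.0, min(1.0, value))  # clip out-of-range input
        samples.append(round(value * levels))
    return samples

def distance(a, b):
    """Mean squared difference between two equal-length sample sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def best_match(digitized, vocabulary):
    """Comparator: return the vocabulary entry whose stored pattern
    is closest to the digitized input."""
    return min(vocabulary, key=lambda word: distance(digitized, vocabulary[word]))

def tone(freq_hz, amplitude=1.0):
    """Hypothetical 'analog' source: a pure tone standing in for a word."""
    return lambda t: amplitude * math.sin(2 * math.pi * freq_hz * t)

RATE = 8000  # samples per second (assumed)
N = 80       # 10 ms of audio at that rate

# Stored patterns, as if loaded from disk into memory at startup.
vocabulary = {
    "low": sample_and_quantize(tone(200), N, RATE),
    "high": sample_and_quantize(tone(400), N, RATE),
}

# A slightly quieter rendition of the "high" tone arrives at the converter.
incoming = sample_and_quantize(tone(400, amplitude=0.9), N, RATE)
print(best_match(incoming, vocabulary))
```

The comparator here is a brute-force nearest-pattern search, which is why both vocabulary size and search speed matter: every stored pattern is checked against each input.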
In practice, the size of a voice-recognition program's effective vocabulary depends directly on the random access memory (RAM) capacity of the computer on which it is installed. A voice-recognition program runs many times faster if the entire vocabulary can be loaded into RAM than if it must search the hard drive for some of the matches. Processing speed is critical as well, because it determines how fast the computer can search RAM for matches.
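The effect of RAM capacity can be illustrated with a small two-tier lookup. The PatternStore class below is hypothetical: it keeps as many patterns in an in-memory dictionary as an assumed capacity allows and counts every lookup that has to fall back to the slower tier. The "disk" is simulated with a second dictionary, so the sketch shows only how often the slow path is taken, not real timings.

```python
class PatternStore:
    """Hypothetical two-tier vocabulary: some patterns held in RAM,
    the remainder left on (simulated) disk."""

    def __init__(self, patterns, ram_capacity):
        words = list(patterns)
        # The first ram_capacity entries fit in memory; the rest do not.
        self.in_ram = {w: patterns[w] for w in words[:ram_capacity]}
        self.on_disk = {w: patterns[w] for w in words[ram_capacity:]}
        self.disk_reads = 0

    def lookup(self, word):
        if word in self.in_ram:
            return self.in_ram[word]  # fast path
        self.disk_reads += 1          # stands in for a slow disk access
        return self.on_disk[word]

# Placeholder patterns; the contents do not matter for this illustration.
patterns = {w: [0] * 4 for w in ["open", "close", "save", "print", "quit"]}

roomy = PatternStore(patterns, ram_capacity=5)    # whole vocabulary fits
cramped = PatternStore(patterns, ram_capacity=2)  # only two words fit

for word in patterns:
    roomy.lookup(word)
    cramped.lookup(word)

print(roomy.disk_reads, cramped.disk_reads)
```

When the whole vocabulary fits, no lookup touches the slow tier; when it does not, every word beyond the capacity costs a disk access.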
All voice-recognition systems make errors. Screaming children, barking dogs, and loud nearby conversations can produce false input. The only way to avoid much of this is to use the system in a quiet room. Words that sound alike but are spelled differently and have different meanings, such as "hear" and "here," pose a further problem. This problem might someday be largely overcome using stored contextual information. However, this will require more RAM and faster processors than are currently available in personal computers.
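One simple form that stored contextual information could take is a table of how often each spelling follows a given preceding word. The sketch below is an assumption about how such a table might be used, not a description of any shipping system; the counts are invented for illustration and the names (FOLLOW_COUNTS, HOMOPHONES, choose_spelling) are hypothetical.

```python
# Invented counts of how often a word follows a given preceding word.
# These numbers are illustrative only, not drawn from a real corpus.
FOLLOW_COUNTS = {
    ("can", "hear"): 40,
    ("can", "here"): 1,
    ("over", "here"): 35,
    ("over", "hear"): 0,
}

# Words the recognizer cannot tell apart by sound alone.
HOMOPHONES = {
    "hear": ("hear", "here"),
    "here": ("hear", "here"),
}

def choose_spelling(previous_word, heard_word):
    """Pick the spelling that most often follows previous_word.
    Words with no known homophones are returned unchanged."""
    candidates = HOMOPHONES.get(heard_word, (heard_word,))
    return max(candidates,
               key=lambda w: FOLLOW_COUNTS.get((previous_word, w), 0))

print(choose_spelling("can", "here"))
print(choose_spelling("over", "hear"))
```

Even this small table has to be held in memory and consulted for every ambiguous word, which hints at why richer context models demand more RAM and processing power.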
Though a number of voice-recognition systems are available on the market, the industry leaders are IBM and Dragon Systems.