Telephony-based Speech Recognition
The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard land-line telephone has a bandwidth (bit rate) of only 64 kbit/s, at a sampling rate of 8 kHz and 8 bits per sample (8000 samples per second × 8 bits per sample = 64,000 bit/s). Therefore, for telephony-based speech recognition, acoustic models should be trained with 8 kHz/8-bit speech audio files.
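The arithmetic above can be sketched in a few lines of Python. The function name `pcm_bit_rate` and its parameters are illustrative assumptions, not part of any telephony library; the sketch assumes uncompressed single-channel audio.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the uncompressed bit rate in bits per second.

    Hypothetical helper for illustration: bit rate is simply
    samples per second * bits per sample * number of channels.
    """
    return sample_rate_hz * bits_per_sample * channels


# Land-line figures from the text: 8 kHz sampling, 8 bits per sample.
landline_bps = pcm_bit_rate(8000, 8)
print(landline_bps)  # 64000 bit/s, i.e. 64 kbit/s
```

The same helper shows how quickly the bit rate grows when either the sampling rate or the bits per sample is raised.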
In the case of Voice over IP, the codec determines the sampling rate and bits per sample of the speech transmission. Codecs with a higher sampling rate or more bits per sample (which improve sound quality) require acoustic models trained on audio data that matches that sampling rate and number of bits per sample.
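One way to act on this requirement is to check each training file against the target format before training. The following is a minimal sketch using Python's standard `wave` module; the helper names `matches_format` and `make_test_wav` are assumptions made for illustration, and the check covers only uncompressed WAV files.

```python
import io
import wave


def matches_format(wav_file, sample_rate_hz: int, bits_per_sample: int) -> bool:
    """Return True if a WAV file has the given sampling rate and bits per sample.

    Hypothetical helper: wav_file may be a path or a binary file object.
    """
    with wave.open(wav_file, "rb") as w:
        return (w.getframerate() == sample_rate_hz
                and w.getsampwidth() * 8 == bits_per_sample)


def make_test_wav(sample_rate_hz: int, bits_per_sample: int, n_frames: int = 80) -> io.BytesIO:
    """Build a small in-memory mono WAV of constant-valued samples for testing."""
    bytes_per_sample = bits_per_sample // 8
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(bytes_per_sample)
        w.setframerate(sample_rate_hz)
        w.writeframes(b"\x00" * (n_frames * bytes_per_sample))
    buf.seek(0)  # rewind so the buffer can be read back
    return buf


# An 8 kHz/8-bit file suits a land-line model; a 16 kHz/16-bit file does not.
print(matches_format(make_test_wav(8000, 8), 8000, 8))
print(matches_format(make_test_wav(16000, 16), 8000, 8))
```

Files that fail the check would need to be converted to the target format, or the model trained on data that matches the codec actually in use.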