Acoustic Model - Telephony-based Speech Recognition

The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard land-line telephone channel carries speech at a bit rate of only 64 kbit/s: a sampling rate of 8 kHz at 8 bits per sample (8,000 samples per second × 8 bits per sample = 64,000 bit/s). Acoustic models for telephony-based speech recognition should therefore be trained on 8 kHz/8-bit speech audio files.
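The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `pcm_bit_rate` and the `channels` parameter are illustrative assumptions, not part of any telephony standard or library.

```python
def pcm_bit_rate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the uncompressed bit rate, in bits per second.

    Hypothetical helper: bit rate = samples per second * bits per sample * channels.
    """
    return sample_rate_hz * bits_per_sample * channels


# Land-line figures quoted in the text: 8 kHz sampling, 8 bits per sample, mono.
landline = pcm_bit_rate(8000, 8)
print(landline)  # 64000 bit/s, i.e. 64 kbit/s
```

The same function shows how quickly the required bit rate grows when either the sampling rate or the bits per sample increases.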

In the case of Voice over IP, the codec determines the sampling rate and bits per sample of the transmitted speech. Codecs that use a higher sampling rate or more bits per sample (which improves sound quality) require acoustic models trained on audio data with the same sampling rate and bits per sample.
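One way to guard against a mismatch is to verify each training file before it is used. The sketch below assumes the training audio is stored as uncompressed WAV files and uses only Python's standard `wave` module. The helper names `audio_matches` and `write_silence` are hypothetical, and the expected values must come from whichever codec is actually deployed.

```python
import os
import tempfile
import wave


def audio_matches(path: str, expected_rate_hz: int, expected_bits_per_sample: int) -> bool:
    """Return True if a WAV file has the expected sampling rate and sample size."""
    with wave.open(path, "rb") as wav:
        rate_ok = wav.getframerate() == expected_rate_hz
        # getsampwidth() reports bytes per sample, so convert to bits.
        bits_ok = wav.getsampwidth() * 8 == expected_bits_per_sample
        return rate_ok and bits_ok


def write_silence(path: str, rate_hz: int, bits_per_sample: int, seconds: float = 0.1) -> None:
    """Write a short mono WAV file of silence (demo helper only)."""
    width = bits_per_sample // 8
    # 8-bit WAV samples are unsigned, so silence is 0x80; wider samples are signed.
    silent_sample = b"\x80" if width == 1 else b"\x00" * width
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(rate_hz)
        wav.writeframes(silent_sample * int(rate_hz * seconds))


with tempfile.TemporaryDirectory() as tmp:
    narrow = os.path.join(tmp, "narrow.wav")
    wide = os.path.join(tmp, "wide.wav")
    write_silence(narrow, 8000, 8)    # matches the 8 kHz/8-bit telephony case
    write_silence(wide, 16000, 16)    # a higher-quality recording
    narrow_ok = audio_matches(narrow, 8000, 8)
    wide_ok = audio_matches(wide, 8000, 8)
    print(narrow_ok, wide_ok)
```

A file that fails the check would need to be converted to the target format, or excluded, before training.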
