Acoustic Model - Telephony-based Speech Recognition

The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard landline telephone channel carries only 64 kbit/s: speech is sampled at 8 kHz with 8 bits per sample (8,000 samples per second × 8 bits per sample = 64,000 bit/s). Acoustic models for telephony-based speech recognition should therefore be trained on 8 kHz, 8-bit speech audio files.
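The arithmetic above, and a check that a training file matches the landline format, can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, and it assumes the training audio is stored as uncompressed WAV files readable by the standard-library `wave` module.

```python
import wave


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) bit rate in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def wav_format(path):
    """Return (sample_rate_hz, bits_per_sample) read from a WAV header."""
    with wave.open(path, "rb") as wav:
        # getsampwidth() reports bytes per sample, so convert to bits.
        return wav.getframerate(), wav.getsampwidth() * 8


def matches_landline_format(path, sample_rate_hz=8000, bits_per_sample=8):
    """True if the file has the 8 kHz / 8-bit format described above."""
    return wav_format(path) == (sample_rate_hz, bits_per_sample)


# 8000 samples per second * 8 bits per sample = 64000 bit/s
print(pcm_bit_rate(8000, 8))
```

Running such a check over a training corpus before training is one way to catch files recorded at a different sampling rate or bit depth.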

In the case of Voice over IP, the codec determines the sampling rate and bits per sample of the transmitted speech. A codec with a higher sampling rate or more bits per sample improves sound quality, but it requires an acoustic model trained on audio with the same sampling rate and bits per sample.
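As a rough sketch of this matching step, a recognizer could look up the codec's sampling rate and pick the acoustic model trained at that rate. The two codecs listed (G.711 at 8 kHz and G.722 at 16 kHz) are examples chosen for illustration and are not named in the text above; the model names and the function are hypothetical. Bits per sample is left out for brevity and could be added to the lookup key in the same way.

```python
# Sampling rates of two example VoIP codecs (illustrative, not exhaustive).
CODEC_SAMPLE_RATE_HZ = {
    "G.711": 8000,   # narrowband, same rate as landline telephony
    "G.722": 16000,  # wideband
}

# Hypothetical acoustic models, keyed by the sampling rate of their
# training audio.
ACOUSTIC_MODELS = {
    8000: "acoustic-model-8k",
    16000: "acoustic-model-16k",
}


def select_acoustic_model(codec):
    """Return the model trained at the codec's sampling rate."""
    try:
        rate = CODEC_SAMPLE_RATE_HZ[codec]
    except KeyError:
        raise ValueError(f"unknown codec: {codec!r}") from None
    return ACOUSTIC_MODELS[rate]


print(select_acoustic_model("G.722"))
```

Raising an error for an unknown codec, rather than falling back to a default model, avoids silently decoding audio with a mismatched acoustic model.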
