Acoustic Model - Telephony-based Speech Recognition

The limiting factor for telephony-based speech recognition is the bandwidth at which speech can be transmitted. For example, a standard landline telephone has a bandwidth of only 64 kbit/s, at a sampling rate of 8 kHz and 8 bits per sample (8000 samples per second × 8 bits per sample = 64,000 bit/s). Therefore, for telephony-based speech recognition, acoustic models should be trained with 8 kHz/8-bit speech audio files.
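The arithmetic above can be sketched in a few lines of Python. The function name pcm_bitrate and its channels parameter are illustrative assumptions, not part of any standard library; the sketch covers only raw, uncompressed audio.

```python
def pcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the raw (uncompressed) bit rate in bits per second.

    Hypothetical helper for illustration: bit rate is simply
    samples per second * bits per sample * number of channels.
    """
    return sample_rate_hz * bits_per_sample * channels


# Landline telephony figures from the text: 8 kHz sampling, 8 bits per sample.
landline_bps = pcm_bitrate(8000, 8)
print(landline_bps)  # 64000 bit/s, i.e. 64 kbit/s
```

The same helper shows how quickly the required bandwidth grows with audio quality: pcm_bitrate(16000, 16) gives 256000 bit/s, four times the landline figure.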

In the case of Voice over IP, the codec determines the sampling rate and the number of bits per sample used for speech transmission. Codecs with a higher sampling rate or more bits per sample (which improve sound quality) require acoustic models trained with audio data that matches that sampling rate and bit depth.
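One way to act on this matching requirement is to check each training file's format before using it. The following is a minimal sketch, assuming the training audio is stored as uncompressed PCM WAV files readable by Python's standard wave module; the function name matches_model_format and its parameters are hypothetical.

```python
import wave


def matches_model_format(path: str, expected_rate_hz: int, expected_bits: int) -> bool:
    """Return True if the WAV file at `path` has the expected
    sampling rate and bits per sample.

    Assumption: the file is uncompressed PCM; wave.open raises
    wave.Error for formats it cannot read.
    """
    with wave.open(path, "rb") as wav:
        rate_hz = wav.getframerate()          # samples per second
        bits = wav.getsampwidth() * 8         # sample width is reported in bytes
    return rate_hz == expected_rate_hz and bits == expected_bits
```

For a landline-style model, one would call matches_model_format(path, 8000, 8) on each file and set aside or convert any file that returns False.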
