Acoustic Model - Desktop-based Speech Recognition

Desktop-based Speech Recognition

For speech recognition on a standard desktop PC, the limiting factor is the sound card. Most sound cards today can record at sampling rates between 16 kHz and 48 kHz, with bit depths of 8 to 16 bits per sample, and can play back audio at up to 96 kHz.

As a general rule, a speech recognition engine works better with acoustic models trained on speech audio recorded at higher sampling rates and bit depths. But audio with too high a sampling rate or bit depth can slow the recognition engine down, because there is more data to process for every second of speech. A compromise is needed. Thus, for desktop speech recognition, the current standard is acoustic models trained on speech audio recorded at a sampling rate of 16 kHz with 16 bits per sample.
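The trade-off can be made concrete with a little arithmetic. The following is a minimal sketch, not part of any particular recognition engine: it computes the uncompressed data rate for a given sampling rate and bit depth, and checks whether a WAV file matches the 16 kHz, 16-bit format described above. The function names (`bytes_per_second`, `matches_desktop_format`, `make_silent_wav`) and the constants are hypothetical and chosen only for illustration. It uses only Python's standard `wave` and `io` modules.

```python
import io
import wave

# Assumed target format, taken from the text: 16 kHz, 16 bits per sample.
TARGET_RATE_HZ = 16_000
TARGET_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16 bits


def bytes_per_second(rate_hz, bits_per_sample, channels=1):
    """Uncompressed PCM data rate in bytes per second."""
    return rate_hz * bits_per_sample // 8 * channels


def matches_desktop_format(wav_file):
    """Return True if a WAV file (path or file object) is 16 kHz, 16-bit PCM."""
    with wave.open(wav_file, "rb") as wav:
        return (wav.getframerate() == TARGET_RATE_HZ
                and wav.getsampwidth() == TARGET_SAMPLE_WIDTH)


def make_silent_wav(rate_hz, sample_width, seconds=1):
    """Hypothetical helper: build an in-memory mono WAV of silence for testing."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * (rate_hz * sample_width * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Compare the data rate at the low, standard, and high ends of the range.
    for rate, bits in [(16_000, 8), (16_000, 16), (48_000, 16)]:
        print(rate, "Hz,", bits, "bit:", bytes_per_second(rate, bits), "bytes/s")
    print(matches_desktop_format(make_silent_wav(16_000, 2)))
    print(matches_desktop_format(make_silent_wav(48_000, 2)))
```

Under these assumptions, mono audio at 48 kHz and 16 bits carries three times as many bytes per second as audio at 16 kHz and 16 bits, which illustrates why the higher setting costs the engine more processing time.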

