Acoustic Model - Desktop-based Speech Recognition

Desktop-based Speech Recognition

For speech recognition on a standard desktop PC, the limiting factor is the sound card. Most sound cards today can record at sampling rates between 16 kHz and 48 kHz, with bit depths of 8 to 16 bits per sample, and can play back at up to 96 kHz.

As a general rule, a speech recognition engine works better with acoustic models trained on speech audio recorded at higher sampling rates and with more bits per sample. But audio with too high a sampling rate or too many bits per sample can slow the recognition engine down, so a compromise is needed. For desktop speech recognition, the current standard is therefore acoustic models trained on speech audio recorded at a sampling rate of 16 kHz with 16 bits per sample.
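
To make the trade-off concrete, the following minimal Python sketch computes the raw (uncompressed) PCM data rate for a given sampling rate and sample size, and checks whether a WAV recording uses the 16 kHz, 16 bits per sample format described above. The function names `pcm_bytes_per_second` and `matches_format` are hypothetical and chosen only for illustration; the sketch assumes uncompressed PCM audio and uses only Python's standard `wave` module.

```python
import wave


def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


def matches_format(wav_file, sample_rate_hz=16000, bits_per_sample=16):
    """Return True if a PCM WAV file has the given sampling rate and sample size.

    wav_file may be a path or an open binary file-like object.
    """
    with wave.open(wav_file, "rb") as w:
        return (w.getframerate() == sample_rate_hz
                and w.getsampwidth() * 8 == bits_per_sample)


if __name__ == "__main__":
    # Compare the amount of data the engine must process per second of speech.
    for rate, bits in [(16000, 8), (16000, 16), (48000, 16)]:
        print(rate, "Hz,", bits, "bits:", pcm_bytes_per_second(rate, bits), "bytes/s")
```

Under these assumptions, mono audio at 48 kHz and 16 bits per sample produces three times as much data per second as audio at 16 kHz and 16 bits per sample, which gives a rough sense of the extra processing load that higher sampling rates impose.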
