Beamforming For Speech Audio

Beamforming can be used to try to extract individual sound sources in a room, such as multiple speakers in the cocktail party problem. This requires the locations of the speakers to be known in advance, for example by measuring the time of arrival of the sound from each source at the microphones in the array and inferring the locations from the resulting distances.
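
As an illustration of why the source location matters, the following minimal sketch steers a delay-and-sum beamformer toward a known position. Delay-and-sum is only one simple beamforming technique and is chosen here for illustration; the function names, the microphone layout, the source position, and the speed of sound are all assumptions, and the simulation ignores attenuation, reverberation, and sub-sample delays.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_delays(mic_positions, source_position, sample_rate):
    """Per-microphone delays, in whole samples, relative to the closest mic."""
    distances = [math.dist(mic, source_position) for mic in mic_positions]
    nearest = min(distances)
    return [round((d - nearest) / SPEED_OF_SOUND * sample_rate) for d in distances]


def delay_and_sum(mic_signals, delays):
    """Advance each channel by its delay, then average the aligned channels."""
    length = min(len(sig) - d for sig, d in zip(mic_signals, delays))
    return [
        sum(sig[n + d] for sig, d in zip(mic_signals, delays)) / len(mic_signals)
        for n in range(length)
    ]


# Hypothetical setup: three mics on a line (metres) and one known source.
sample_rate = 16000
mics = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0)]
source_position = (2.0, 1.0)
delays = steering_delays(mics, source_position, sample_rate)

# Simulate what each mic records: the same 440 Hz tone,
# arriving later at the mics that are farther from the source.
source = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(200)]
recordings = [[0.0] * d + source for d in delays]

# Aligning the channels with the known delays recovers the source signal.
output = delay_and_sum(recordings, delays)
```

Sound from the steered position adds coherently after alignment, while sound from other positions arrives with mismatched delays and is partly averaged out. If the assumed source position is wrong, the delays are wrong and the desired speaker is attenuated as well.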

It is useful to separate the signal into frequency bands with specialized filter banks prior to beamforming. Different frequencies have different optimal beamforming filters, so each band can be treated as a separate problem: many filters are run in parallel, and the bands are then recombined. Standard filters such as FFT bands are suboptimal for this purpose because they are not designed to isolate bands. For example, the FFT implicitly assumes that the only frequencies present in the signal are exactly its own harmonics, that is, integer multiples of its frequency resolution. A frequency that lies between these harmonics will typically activate all of the FFT channels (an effect known as spectral leakage), which is not what is wanted in a beamforming analysis. Instead, filters can be designed so that each channel detects only frequencies local to its band. The recombination property is also required: there must be enough information in these receptive fields to reconstruct the signal. These bases are typically non-orthogonal, unlike the FFT basis.
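
The leakage behaviour described above can be checked numerically. The sketch below computes a plain discrete Fourier transform of a sine wave and counts how many channels respond; the helper names, the 64-sample frame, and the threshold of 0.1% of the peak magnitude are assumptions made for illustration, and no window function is applied.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the non-negative-frequency DFT bins of a real signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def count_active_bins(freq_in_bins, n=64, threshold=1e-3):
    """Count bins whose magnitude exceeds `threshold` times the peak magnitude.

    `freq_in_bins` is the tone frequency measured in units of the bin spacing,
    so an integer value lies exactly on a DFT harmonic.
    """
    signal = [math.sin(2 * math.pi * freq_in_bins * t / n) for t in range(n)]
    mags = dft_magnitudes(signal)
    peak = max(mags)
    return sum(1 for m in mags if m > threshold * peak)


on_harmonic = count_active_bins(8.0)       # tone exactly on bin 8
between_harmonics = count_active_bins(8.5)  # tone halfway between bins 8 and 9
total_bins = 64 // 2 + 1
```

With these settings the on-harmonic tone activates a single bin, while the tone halfway between two harmonics exceeds the threshold in every one of the 33 bins. A filter bank designed for beamforming aims to avoid this, so that a tone mainly excites the channels near its own frequency.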
