Acoustic diversity for improved speech recognition in reverberant environments

We show that even moderate reverberation degrades both the audible quality of speech and automatic speech recognition (ASR) accuracy. We assess several important speech enhancement techniques in the presence of room reverberation and show that they offer little improvement. We show experimentally that multiple microphones are necessary for complete equalization of the speaker-to-receiver impulse response. Furthermore, when complete equalization is not possible, a long reverberation time (RT60) is shown to harm ASR accuracy far more than a low signal-to-reverberation ratio (SRR). Using this knowledge, we develop an equalization strategy that improves ASR accuracy by reducing RT60.
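
The two quantities contrasted above, RT60 and SRR, can both be read off a speaker-to-receiver impulse response. The following is a minimal illustrative sketch, not the method used in this work: it builds a toy impulse response (a direct-path spike plus an exponentially decaying tail), estimates RT60 from the Schroeder energy decay curve, and computes SRR as the ratio of early to late energy. The function names, the 2 ms direct-path window, and the -5 dB to -25 dB fitting range are all assumptions chosen for illustration.

```python
import math


def synthetic_rir(rt60, fs, direct_gain=1.0, tail_gain=0.1, length_s=None):
    """Toy impulse response: direct-path spike plus an exponential tail (hypothetical model)."""
    n = int((length_s or 1.5 * rt60) * fs)
    decay = 3.0 * math.log(10.0) / rt60  # amplitude falls by 60 dB after rt60 seconds
    h = [tail_gain * math.exp(-decay * i / fs) for i in range(n)]
    h[0] = direct_gain  # direct path arrives at sample 0
    return h


def energy_decay_curve_db(h):
    """Schroeder backward integration of h^2, normalised to 0 dB at t = 0."""
    edc, acc = [], 0.0
    for x in reversed(h):
        acc += x * x
        edc.append(acc)
    edc.reverse()
    total = edc[0]
    return [10.0 * math.log10(e / total) for e in edc if e > 0.0]


def estimate_rt60(h, fs, hi_db=-5.0, lo_db=-25.0):
    """Least-squares line through the decay curve between hi_db and lo_db, extrapolated to -60 dB."""
    edc = energy_decay_curve_db(h)
    pts = [(i / fs, d) for i, d in enumerate(edc) if lo_db <= d <= hi_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_d = sum(d for _, d in pts) / len(pts)
    slope = (sum((t - mean_t) * (d - mean_d) for t, d in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second, negative
    return -60.0 / slope


def srr_db(h, fs, direct_ms=2.0):
    """Energy in the first direct_ms milliseconds relative to everything after it, in dB."""
    split = max(1, int(direct_ms * 1e-3 * fs))
    direct = sum(x * x for x in h[:split])
    reverb = sum(x * x for x in h[split:])
    return 10.0 * math.log10(direct / reverb)


if __name__ == "__main__":
    fs = 8000
    h = synthetic_rir(rt60=0.5, fs=fs)
    print("estimated RT60 (s):", round(estimate_rt60(h, fs), 3))
    print("SRR (dB):", round(srr_db(h, fs), 2))
```

In this toy model the two measures vary independently: `tail_gain` changes SRR without changing the decay rate, while `rt60` sets the decay rate. That makes it possible to shorten the tail without necessarily raising SRR, which is the distinction the abstract draws.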
