Survey on approaches to speech recognition in reverberant environments

This paper surveys the state of the art in reverberant speech processing from the viewpoint of speech recognition. It first argues that the key to successful reverberant speech recognition is accounting for the long-term dependencies between reverberant observations in consecutive time frames. It then describes a range of approaches that exploit these dependencies in different ways, from signal and feature dereverberation to acoustic model compensation tailored to reverberation. Finally, it presents a framework for classifying these approaches that highlights their similarities and differences.
