Robust distributed speech recognition using two-stage Filtered Minima Controlled Recursive Averaging

This paper examines a new Filtered Minima-Controlled Recursive Averaging (FMCRA) noise estimation technique as a robust front-end processing stage for improving the performance of a Distributed Speech Recognition (DSR) system in noisy environments. The noisy speech is enhanced within a two-stage framework that addresses both the inefficiency of the Voice Activity Detector (VAD) and the inadequacies of MCRA. An evaluation on the Aurora 2 task showed that including FMCRA in the front end leads to a significant improvement in DSR accuracy.
