Robust Speech Enhancement Using Two-Stage Filtered Minima Controlled Recursive Averaging

In this paper we propose an algorithm for estimating noise in highly non-stationary noisy environments, which is a challenging problem in speech enhancement. This method is based on minima-controlled recursive averaging (MCRA) whereby an accurate, robust and efficient noise power spectrum estimation is demonstrated. We propose a two-stage technique to prevent the appearance of musical noise after enhancement. This algorithm filters the noisy speech to achieve a robust signal with minimum distortion in the first stage. Subsequently, it estimates the residual noise using MCRA and removes it with spectral subtraction. The proposed Filtered MCRA (FMCRA) performance is evaluated using objective tests on the Aurora database under various noisy environments. These measures indicate the higher output SNR and lower output residual noise and distortion.

[1]  David Malah,et al.  Speech enhancement using optimal non-linear spectral amplitude estimation , 1983, ICASSP.

[2]  I. Cohen,et al.  Noise estimation by minima controlled recursive averaging for robust speech enhancement , 2002, IEEE Signal Processing Letters.

[3]  Sofia Ben Jebara,et al.  Perceptual musical noise reduction using critical bands tonality coefficients and masking thresholds , 2007, INTERSPEECH.

[4]  Rainer Martin,et al.  Noise power spectral density estimation based on optimal smoothing and minimum statistics , 2001, IEEE Trans. Speech Audio Process..

[5]  Wonyong Sung,et al.  A voice activity detector employing soft decision based noise spectrum adaptation , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[6]  David Pearce,et al.  The aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions , 2000, INTERSPEECH.

[7]  Israel Cohen,et al.  Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging , 2003, IEEE Trans. Speech Audio Process..

[8]  Philipos C. Loizou,et al.  Speech Enhancement: Theory and Practice , 2007 .

[9]  Andries P. Hekstra,et al.  Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[10]  Richard M. Schwartz,et al.  Enhancement of speech corrupted by acoustic noise , 1979, ICASSP.