A modified Ephraim-Malah noise suppression rule for automatic speech recognition

A soft-decision gain modification is introduced and applied to the Ephraim-Malah gain function based on minimum mean square error (MMSE) estimation (Ephraim, Y. and Malah, D., IEEE Trans. Acoust. Speech Sig. Process., vol. ASSP-32, no. 6, pp. 1109-21, 1984; vol. ASSP-33, no. 2, pp. 443-5, 1985) after amplitude compression. Non-linear evaluations of the noise overestimation factor and of the spectral floor are used in the same way for the proposed gain modification and for non-linear spectral subtraction (NSS). With respect to NSS, the proposed approach yields consistent and statistically significant improvements in automatic speech recognition (ASR) across the different noise conditions of the AURORA2 and AURORA3 corpora. Because the non-linearity affects both approaches in the same way, the comparison is particularly informative.
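To make the roles of the noise overestimation factor and the spectral floor concrete, the following is a minimal sketch of magnitude spectral subtraction for a single frequency bin. It is not the method of the paper: the function names, the linear SNR-to-factor mapping, the default values (`alpha_max`, `alpha_min`, `snr_lo`, `snr_hi`, `beta`), and the choice of flooring against the noisy magnitude are all assumptions made for illustration.

```python
import math


def overestimation_factor(snr_db, alpha_max=3.0, alpha_min=1.0,
                          snr_lo=0.0, snr_hi=20.0):
    """Hypothetical SNR-dependent overestimation factor.

    Assumption: the factor falls linearly from alpha_max at snr_lo
    to alpha_min at snr_hi and is clamped outside that range.
    """
    if snr_db <= snr_lo:
        return alpha_max
    if snr_db >= snr_hi:
        return alpha_min
    t = (snr_db - snr_lo) / (snr_hi - snr_lo)
    return alpha_max + t * (alpha_min - alpha_max)


def subtract_bin(noisy_mag, noise_mag, beta=0.1):
    """Spectral subtraction with flooring for one frequency bin.

    noisy_mag: magnitude of the noisy speech spectrum in this bin
    noise_mag: estimated noise magnitude in this bin
    beta:      spectral floor, as a fraction of the noisy magnitude
               (an assumption; other schemes floor against the noise)
    """
    eps = 1e-12  # guards against log of zero and division by zero
    snr_db = 20.0 * math.log10(max(noisy_mag, eps) / max(noise_mag, eps))
    alpha = overestimation_factor(snr_db)
    cleaned = noisy_mag - alpha * noise_mag  # overestimated noise removed
    floor = beta * noisy_mag                 # lower bound on the output
    return max(cleaned, floor)
```

In this sketch a bin with high SNR is subtracted with a factor close to 1, while a bin dominated by noise is subtracted aggressively and then held at the floor, which is the usual motivation for making both quantities depend on the local SNR.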

[1] Hanseok Ko, et al. A novel spectral subtraction scheme for robust speech recognition: spectral subtraction using spectral harmonics of speech, 2003, 2003 International Conference on Multimedia and Expo. ICME '03. Proceedings (Cat. No.03TH8698).

[2] Jérôme Boudy, et al. Experiments with a nonlinear spectral subtractor (NSS), Hidden Markov models and the projection, for robust speech recognition in cars, 1991, Speech Commun.

[3] Roberto Gemello, et al. Robust multiple resolution analysis for automatic speech recognition, 2002, INTERSPEECH.

[4] David Malah, et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, 1984, IEEE Trans. Acoust. Speech Signal Process.

[5] I. Cohen. Optimal speech enhancement under signal presence uncertainty using log-spectral amplitude estimator, 2002, IEEE Signal Processing Letters.

[6] Volker Schless, et al. SNR-dependent flooring and noise overestimation for joint application of spectral subtraction and model combination, 1998, ICSLP.

[7] Marco Matassoni, et al. Some experiments on the use of one-channel noise reduction techniques with the Italian SpeechDat Car database, 2001, IEEE Workshop on Automatic Speech Recognition and Understanding, 2001. ASRU '01.

[8] David Malah, et al. Speech enhancement using optimal non-linear spectral amplitude estimation, 1983, ICASSP.

[9] Christophe Beaugeant, et al. Noise reduction using perceptual spectral change, 1999, EUROSPEECH.

[10] Nam Soo Kim, et al. Spectral enhancement based on global soft decision, 2000, IEEE Signal Processing Letters.

[11] Roberto Gemello, et al. Multi-source neural networks for speech recognition, 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).