Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions

Speech enhancement is a common approach to mitigating the degradation caused by noise and channel contamination. It aims to suppress the unwanted signal components and recover the clean speech. In this paper, we focus on two simple, computationally inexpensive methods: Wiener filtering (WF) and spectral subtraction (SS). Conventionally, these methods are formulated independently of automatic speech recognition (ASR). We propose to optimize these conventional speech enhancement techniques with respect to the likelihood of the acoustic model. We also extend these techniques, which were originally designed for denoising, to address reverberation as well. In experiments in real noisy and reverberant environments, the proposed approach achieved a significant improvement in recognition performance.
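For orientation, the two conventional techniques named above can be sketched in their textbook power-spectrum form. This is a minimal illustration and not the optimized formulation proposed in the paper: the function names, the over-subtraction factor `alpha`, the spectral floor `beta`, and their default values are all assumptions made for the example. In the proposed approach, parameters of this kind would be chosen with respect to the acoustic model likelihood, which is not shown here.

```python
import numpy as np

def spectral_subtraction(noisy_power, noise_power, alpha=1.0, beta=0.01):
    """Textbook power spectral subtraction (illustrative sketch).

    alpha (over-subtraction factor) and beta (spectral floor) are
    hypothetical parameters chosen for this example only.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Subtract the scaled noise estimate from the noisy power spectrum.
    cleaned = noisy_power - alpha * noise_power
    # Floor the result so that no bin becomes negative.
    return np.maximum(cleaned, beta * noisy_power)

def wiener_gain(noisy_power, noise_power, eps=1e-12):
    """Textbook Wiener gain per frequency bin (illustrative sketch).

    The speech power is estimated by simple subtraction, which is an
    assumption of this example; eps guards against division by zero.
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Crude speech power estimate, clipped at zero.
    speech_power = np.maximum(noisy_power - noise_power, 0.0)
    # Gain = estimated speech power / (speech power + noise power).
    return speech_power / (speech_power + noise_power + eps)

# Toy two-bin example: one speech-dominated bin, one noise-dominated bin.
noisy = np.array([4.0, 1.0])
noise = np.array([1.0, 2.0])
enhanced_ss = spectral_subtraction(noisy, noise)
enhanced_wf = wiener_gain(noisy, noise) * noisy
```

In the toy example, both methods leave the speech-dominated bin largely intact and strongly attenuate the noise-dominated bin. The floor in spectral subtraction keeps a small residual rather than a hard zero.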
