Paper information - Speech enhancement using combination of dereverberation and noise reduction for robust speech recognition

In this paper, we describe a speech enhancement approach for robust speech recognition. The approach has two stages that address two current problems in speech recognition: reverberation and noise. First, the speech signal is dereverberated by suppression of slowly-varying components and the falling edge of the power envelope (SSF). Binaural speech processing is then applied to remove noise from the target speech. Speech recognition results show that the combined algorithm provides good robustness in real environments.
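
The dereverberation stage can be illustrated with a minimal sketch of SSF-style processing on a per-channel power envelope. The abstract does not give the update rule or parameter values, so everything here is an assumption: the function name `ssf_type1`, the first-order lowpass used as the "slowly-varying component", the forgetting factor `lam`, and the floor coefficient `c0` are illustrative choices in the spirit of the SSF literature, not the authors' implementation.

```python
import numpy as np


def ssf_type1(power, lam=0.4, c0=0.01):
    """Sketch of SSF-style envelope processing (assumed form, not the paper's code).

    power : array of shape (frames, channels) holding non-negative
            short-time power per frequency channel.
    lam   : forgetting factor of the first-order lowpass (assumed value).
    c0    : floor coefficient that keeps the output positive (assumed value).
    """
    power = np.asarray(power, dtype=float)
    out = np.empty_like(power)
    # Running lowpass estimate of the slowly-varying power, initialised
    # with the first frame (an assumption made for this sketch).
    slow = power[0].copy()
    for m in range(power.shape[0]):
        slow = lam * slow + (1.0 - lam) * power[m]
        # Subtracting the lowpass estimate keeps onsets and suppresses
        # steady-state and decaying (falling-edge) power; the floor
        # c0 * power[m] prevents negative or zero output.
        out[m] = np.maximum(power[m] - slow, c0 * power[m])
    return out


# Toy single-channel envelope: silence, an onset, then a decaying tail.
envelope = np.array([[0.0], [0.0], [1.0], [1.0], [0.5], [0.25]])
processed = ssf_type1(envelope)
```

In this toy envelope the onset frame keeps most of its power, while the decaying frames, which stand in for a reverberant tail, are reduced to the floor. The binaural noise-reduction stage is not sketched because the abstract gives no detail about it.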

Dang Khoa Nguyen | Quoc Cuong Nguyen | Tien Dung Tran | Huu Binh Nguyen | Thi Anh Xuan Tran
