Paper Information - Statistical Model-Based Noise Reduction Approach for Car Interior Applications to Speech Recognition

This paper presents a statistical model-based noise suppression approach for voice recognition in a car environment. To alleviate the spectral whitening and signal distortion problems of the traditional decision-directed Wiener filter, we combine a decision-directed method with an original spectrum reconstruction method and develop a new two-stage noise reduction filter estimation scheme. When the tradeoff between performance and computational efficiency on resource-constrained automotive devices is considered, the ETSI standard advanced distributed speech recognition front-end (ETSI-AFE) can be an effective solution, and ETSI-AFE is also based on the decision-directed Wiener filter. Thus, a series of voice recognition and computational complexity tests is conducted to compare the proposed approach with ETSI-AFE. The experimental results show that the proposed approach is superior to the conventional method in terms of speech recognition accuracy, while the computational cost and frame latency are significantly reduced.
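For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of a conventional decision-directed Wiener gain estimator. It is not the paper's two-stage scheme and not the ETSI-AFE algorithm. The function name `decision_directed_wiener`, the smoothing factor `alpha=0.98`, the `gain_floor` value, and the assumption of a known, stationary noise power spectrum are all illustrative choices that do not come from the paper.

```python
def decision_directed_wiener(noisy_power, noise_power, alpha=0.98, gain_floor=0.1):
    """Per-bin Wiener gains from a decision-directed a priori SNR estimate.

    noisy_power: list of frames; each frame is a list of |Y(k)|^2 values.
    noise_power: list of noise power values per bin (assumed known and fixed).
    alpha, gain_floor: hypothetical tuning values chosen for illustration.
    Returns one list of gains per frame.
    """
    num_bins = len(noise_power)
    prev_clean_power = [0.0] * num_bins  # clean-speech power estimate of the previous frame
    gains = []
    for frame in noisy_power:
        frame_gains = []
        for k in range(num_bins):
            # A posteriori SNR: noisy power relative to the noise power.
            gamma = frame[k] / noise_power[k]
            # Decision-directed a priori SNR: weighted mix of the previous
            # frame's clean estimate and the current instantaneous estimate.
            xi = (alpha * prev_clean_power[k] / noise_power[k]
                  + (1.0 - alpha) * max(gamma - 1.0, 0.0))
            # Wiener gain, floored to limit over-suppression.
            gain = max(xi / (1.0 + xi), gain_floor)
            frame_gains.append(gain)
            prev_clean_power[k] = (gain ** 2) * frame[k]
        gains.append(frame_gains)
    return gains


# Bin 0 carries strong speech (power 100 against noise 1); bin 1 is noise only.
example_gains = decision_directed_wiener(
    noisy_power=[[100.0, 1.0], [100.0, 1.0], [100.0, 1.0]],
    noise_power=[1.0, 1.0],
)
```

In this sketch the gain in the speech-dominated bin rises over successive frames as the recursive estimate builds up, while the noise-only bin stays at the floor. The heavy smoothing by `alpha` is what makes the estimate lag behind speech onsets, a commonly noted weakness of this family of estimators.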

Ho-Young Jung | Yunkeun Lee | Hyung Soon Kim | Sung Joo Lee | Byung Ok Kang

[1] Jasha Droppo, et al. A noise-robust ASR front-end using Wiener filter constructed from MMSE estimation of clean speech and noise, 2003, 2003 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE Cat. No.03EX721).

[2] Ho-Young Jung, et al. Discriminative noise adaptive training approach for an environment migration, 2007, INTERSPEECH.

[3] Norbert Wiener, et al. Extrapolation, Interpolation, and Smoothing of Stationary Time Series, 1964.

[4] Nathalie Virag, et al. Single channel speech enhancement based on masking properties of the human auditory system, 1999, IEEE Trans. Speech Audio Process.

[5] Yifan Gong, et al. Speech recognition in noisy environments: A survey, 1995, Speech Commun.

[6] Wen-Rong Wu, et al. Subband Kalman filtering for speech enhancement, 1998.

[7] Ho-Young Jung, et al. Model Adaptation Using Discriminative Noise Adaptive Training Approach for New Environments, 2008.

[8] Hyung Soon Kim, et al. Narrowband to wideband conversion of speech using GMM based transformation, 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[9] Anshu Agarwal, et al. Two-Stage Mel-Warped Wiener Filter for Robust Speech Recognition, 1999.

[10] Jerry D. Gibson, et al. Filtering of colored noise for speech enhancement and coding, 1991, IEEE Trans. Signal Process.

[11] Denis Jouvet, et al. Evaluation of a noise-robust DSR front-end on Aurora databases, 2002, INTERSPEECH.

[12] Speech Processing, Transmission and Quality Aspects (STQ); Test Methodologies for ETSI Test Events and Results; Part 2: 1st ETSI Plugtests Speech Quality Test Event Report, 2022.

[13] S. Boll, et al. Suppression of acoustic noise in speech using spectral subtraction, 1979.

[14] Alexander Kain, et al. Spectral voice conversion for text-to-speech synthesis, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[15] David Pearce, et al. A robust front-end algorithm for distributed speech recognition, 2001, INTERSPEECH.

[16] Young-Joo Suh, et al. Feature Compensation Combining SNR-Dependent Feature Reconstruction and Class Histogram Equalization, 2008.

[17] A.V. Oppenheim, et al. Enhancement and bandwidth compression of noisy speech, 1979, Proceedings of the IEEE.

[18] David Malah, et al. Speech enhancement using a minimum mean-square error log-spectral amplitude estimator, 1984, IEEE Trans. Acoust. Speech Signal Process.

[19] Masanori Tsujikawa, et al. Model-Based Wiener Filter for Noise Robust Speech Recognition, 2006, 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings.

[20] Yariv Ephraim, et al. Statistical-model-based speech enhancement systems, 1992, Proc. IEEE.

[21] Ho-Young Jung, et al. A Commercial Car Navigation System using Korean Large Vocabulary Automatic Speech Recognizer, 2009.

[22] Hamid Sheikhzadeh, et al. HMM-based strategies for enhancement of speech signals embedded in nonstationary noise, 1998, IEEE Trans. Speech Audio Process.

[23] Ephraim. Speech enhancement using a minimum mean square error short-time spectral amplitude estimator, 1984.