An improved Mel-Wiener filter for Mel-LPC based speech recognition

We previously proposed a Mel-Wiener filter to enhance Mel-LPC spectra in the presence of additive noise. The filter was estimated by minimizing the sum of squared errors on the linear frequency scale and was implemented efficiently in the autocorrelation domain without denoising the input speech. In that system, speech and noise were segregated using an energy-based voice activity detector (VAD), and a very simple flooring technique was used for noise segments. In the present work, we improve both the VAD, using an autoregressive (AR) model of the noise, and the flooring technique. In addition, a lag window is applied to the estimated noise autocorrelation function to smooth the spectral fine structure contributed by the high-order autocorrelation coefficients. As a result, a substantial improvement is obtained over the previous results.
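The lag-windowing step can be illustrated with a minimal sketch. The abstract does not state which window shape, bandwidth, or analysis order is used, so the Gaussian window, the 100 Hz bandwidth, the 8 kHz sampling rate, and all function names below are assumptions made for illustration only. The sketch also works on the ordinary (linear-frequency) autocorrelation rather than the Mel-warped autocorrelation used in Mel-LPC analysis.

```python
import numpy as np


def autocorrelation(frame, max_lag):
    """Biased autocorrelation estimate r[0..max_lag] of one frame."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array(
        [np.dot(frame[: n - k], frame[k:]) / n for k in range(max_lag + 1)]
    )


def gaussian_lag_window(max_lag, bandwidth_hz, sample_rate_hz):
    """Gaussian lag window w[k] = exp(-0.5 * (2*pi*B*k/fs)**2).

    The Gaussian shape is an assumption; the abstract does not name
    the window that the authors apply.
    """
    k = np.arange(max_lag + 1)
    return np.exp(-0.5 * (2.0 * np.pi * bandwidth_hz * k / sample_rate_hz) ** 2)


def smooth_noise_autocorrelation(
    noise_frames, max_lag, bandwidth_hz=100.0, sample_rate_hz=8000.0
):
    """Average the autocorrelation over noise frames, then lag-window it.

    Multiplying by the window leaves r[0] (the noise power) unchanged and
    attenuates the high-lag coefficients, which smooths the fine structure
    of the corresponding noise spectrum.
    """
    r = np.mean([autocorrelation(f, max_lag) for f in noise_frames], axis=0)
    return r * gaussian_lag_window(max_lag, bandwidth_hz, sample_rate_hz)


# Hypothetical usage: ten synthetic noise frames of 200 samples each.
rng = np.random.default_rng(0)
noise_frames = rng.standard_normal((10, 200))
r_smooth = smooth_noise_autocorrelation(noise_frames, max_lag=12)
```

Because the window equals one at lag zero and decays with lag, the estimated noise energy is preserved while the less reliable high-order coefficients are down-weighted.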
