Noise Robust Feature Extraction Based on Extended Weighted Linear Prediction in LVCSR

This paper applies extended weighted linear prediction (XLP) to noise-robust short-time spectrum analysis in the feature extraction stage of a speech recognition system. XLP is a generalization of standard linear prediction (LP) and temporally weighted linear prediction (WLP), both of which have already been applied to noise-robust speech recognition with good results. XLP gives finer control over the temporal weighting of different parts of the noisy speech signal by taking the lags of the signal into account in the prediction. Here, XLP is compared with WLP and with the conventional spectrum analysis methods FFT and LP in a large vocabulary continuous speech recognition (LVCSR) task using real-world noisy data containing additive and convolutive noise. The results show improvements over the reference methods in several cases.

Index Terms: linear prediction, temporal weighting, noise robust, speech recognition
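To make the relation between LP and WLP concrete, the following is a minimal sketch, not the authors' implementation. It assumes a covariance-style error range and the short-time energy (STE) weight that is commonly used with WLP; the function names `weighted_lp` and `ste_weights` and the `window` parameter are hypothetical. Uniform weights reduce the problem to ordinary least-squares LP. XLP, which additionally lets the weight depend on the lag, is not shown here.

```python
import numpy as np


def weighted_lp(x, order, weights=None):
    """Predictor coefficients a[1..p] minimizing
    sum_n w[n] * (x[n] - sum_k a[k] * x[n-k])**2 for n = p .. N-1.

    The error range (covariance-style) is an assumption of this sketch.
    With weights=None every w[n] is 1, which is plain least-squares LP.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(order, len(x))
    if weights is None:
        weights = np.ones(len(n))
    w = np.asarray(weights, dtype=float)
    # Columns hold the delayed signals x[n-1], ..., x[n-p].
    delayed = np.stack([x[n - k] for k in range(1, order + 1)], axis=1)
    # Weighted normal equations: (D^T W D) a = D^T W x.
    R = delayed.T @ (w[:, None] * delayed)
    r = delayed.T @ (w * x[n])
    return np.linalg.solve(R, r)


def ste_weights(x, order, window):
    """Short-time energy of the `window` samples preceding each n.

    Samples before the start of the frame are treated as zero, and a
    small floor keeps the weights strictly positive (both assumptions).
    """
    x = np.asarray(x, dtype=float)
    sq = np.concatenate([np.zeros(window), x ** 2])
    csum = np.concatenate([[0.0], np.cumsum(sq)])
    n = np.arange(order, len(x))
    # Sum of x[n-window] ** 2 ... x[n-1] ** 2 via a cumulative sum.
    return csum[n + window] - csum[n] + 1e-12
```

The STE weight emphasizes samples that follow high-energy segments of the frame, which tend to be less affected by additive noise than low-energy segments; that is the intuition behind temporal weighting in this family of methods.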
