Studies on Noise Robust Automatic Speech Recognition

Noise in everyday acoustic environments such as cars, traffic environments, and cafeterias remains one of the main challenges in automatic speech recognition (ASR). As a reserch theme, it has received wide attention in conferences and scientific journals focused on speech technology. This article collection reviews both the classic and novel approaches suggested for noise robust ASR. The articles are literature reviews written for the spring 2009 seminar course on noise robust automatic speech recognition (course code T-61.6060) held at TKK.

[1]  Bhaskar D. Rao,et al.  All-pole modeling of speech based on the minimum variance distortionless response spectrum , 2000, Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers (Cat. No.97CB36136).

[2]  Martin Cooke,et al.  A glimpsing model of speech perception in noise. , 2006, The Journal of the Acoustical Society of America.

[3]  Richard Lippmann,et al.  Speech recognition by machines and humans , 1997, Speech Commun..

[4]  H Hermansky,et al.  Perceptual linear predictive (PLP) analysis of speech. , 1990, The Journal of the Acoustical Society of America.

[5]  Daniel P. W. Ellis,et al.  Speech and Audio Signal Processing - Processing and Perception of Speech and Music, Second Edition , 1999 .

[6]  Wonyong Sung,et al.  A statistical model-based voice activity detection , 1999, IEEE Signal Processing Letters.

[7]  Eric A. Lehmann,et al.  Particle Filter with Integrated Voice Activity Detection for Acoustic Source Tracking , 2007, EURASIP J. Adv. Signal Process..

[8]  Sven Nordholm,et al.  Real-Time Implementation of a Particle Filter with Integrated Voice Activity Detector for Acoustic Speaker Tracking , 2006, APCCAS 2006 - 2006 IEEE Asia Pacific Conference on Circuits and Systems.

[9]  Daniel P. W. Ellis,et al.  The auditory organization of speech and other sources in listeners and computational models , 2001, Speech Commun..

[10]  Bhaskar D. Rao,et al.  Robust Feature Extraction for Continuous Speech Recognition Using the MVDR Spectrum Estimation Method , 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[11]  Paavo Alku,et al.  Stabilised weighted linear prediction , 2009, Speech Commun..

[12]  Joseph Picone,et al.  Benchmarking human performance for continuous speech recognition , 1996, Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96.

[13]  J. Makhoul,et al.  Linear prediction: A tutorial review , 1975, Proceedings of the IEEE.

[14]  John H. L. Hansen,et al.  A new perceptually motivated MVDR-based acoustic front-end (PMVDR) for robust automatic speech recognition , 2008, Speech Commun..

[15]  Richard M. Stern,et al.  A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition , 2004, Speech Commun..