In this paper we present a weighted likelihood ratio (WLR) based hidden Markov model (HMM) and apply it to speech recognition in noise. The WLR measure emphasizes spectral peaks more than valleys when comparing two speech spectra. It is therefore more consistent with human perception of speech formants, which lie at the natural resonances of the vocal tract, and it tends to be more robust to broad-band noise interference than other measures. A complete HMM framework for this measure is derived, and a mixture of exponential kernels is used to model the output probability density function. The new WLR-HMM is tested on the Aurora2 connected-digits database in noise, where it is more robust than the MFCC-trained GMM baseline system. When combined with dynamic cepstral features, the multiple-stream WLR-HMM shows a 39% relative improvement over the baseline system.
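To illustrate why a WLR-style measure weights peaks more heavily than valleys, the following minimal sketch uses one commonly cited spectral-domain form of the distortion: the mean over frequency bins of (log P - log Q)(P - Q), computed on gain-normalized power spectra. This form, the mean-based gain normalization, the function names `wlr_distortion` and `log_spectral_distance`, and the toy spectra are all assumptions made for illustration; the abstract does not state the exact definition used in the paper.

```python
import numpy as np

def wlr_distortion(p, q):
    """WLR-style distortion between two power spectra (illustrative form).

    Assumption: spectra are gain-normalized by their mean, and the
    distortion is the bin average of (log P - log Q) * (P - Q).
    Each term is non-negative because log is monotonically increasing.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.mean()  # remove overall gain
    q = q / q.mean()
    return float(np.mean((np.log(p) - np.log(q)) * (p - q)))

def log_spectral_distance(p, q):
    """Plain mean squared log-spectral difference, for comparison."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.mean((np.log(p) - np.log(q)) ** 2))

# Toy spectrum (hypothetical values): a peak at bin 2, a valley at bin 5.
ref = np.array([1.0, 1.0, 8.0, 1.0, 1.0, 0.1, 1.0, 1.0])

peak_changed = ref.copy()
peak_changed[2] *= 2.0    # double the peak

valley_changed = ref.copy()
valley_changed[5] *= 2.0  # double the valley

d_peak = wlr_distortion(ref, peak_changed)
d_valley = wlr_distortion(ref, valley_changed)
lsd_peak = log_spectral_distance(ref, peak_changed)
lsd_valley = log_spectral_distance(ref, valley_changed)

print("WLR  peak vs valley:", d_peak, d_valley)
print("LSD  peak vs valley:", lsd_peak, lsd_valley)
```

Under these assumptions, the same relative change costs far more at the peak than at the valley in the WLR-style measure, while the plain log-spectral distance scores both changes identically. Since broad-band noise mostly fills in spectral valleys, a measure that discounts valleys is less disturbed by it.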