Connectionist Approaches to the Use of Markov Models for Speech Recognition

Previous work has shown the ability of Multilayer Perceptrons (MLPs) to estimate emission probabilities for Hidden Markov Models (HMMs). The advantages of a speech recognition system incorporating both MLPs and HMMs are the best discrimination and the ability to incorporate multiple sources of evidence (features, temporal context) without restrictive assumptions of distributions or statistical independence. This paper presents results on the speaker-dependent portion of DARPA's English language Resource Management database. Results support the previously reported utility of MLP probability estimation for continuous speech recognition. An additional approach we are pursuing is to use MLPs as nonlinear predictors for autoregressive HMMs. While this is shown to be more compatible with the HMM formalism, it still suffers from several limitations. This approach is generalized to take account of time correlation between successive observations, without any restrictive assumptions about the driving noise.

[1]  Hervé Bourlard,et al.  Statistical Inference in Multilayer Perceptrons and Hidden Markov Models with Applications in Continuous Speech Recognition , 1989, NATO Neurocomputing.

[2]  H. Bourlard,et al.  Links Between Markov Models and Multilayer Perceptrons , 1990, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  C. J. Wellekens,et al.  Explicit time correlation in hidden Markov models for speech recognition , 1987, ICASSP '87. IEEE International Conference on Acoustics, Speech, and Signal Processing.

[4]  Alex Waibel,et al.  Large vocabulary recognition using linked predictive neural networks , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[5]  Hervé Bourlard HOW CONNECTIONIST MODELS COULD IMPROVE MARKOV MODELS FOR SPEECH RECOGNITION , 1990 .

[6]  Hervé Bourlard,et al.  Continuous speech recognition on the resource management database using connectionist probability estimation , 1990, ICSLP.

[7]  Hervé Bourlard,et al.  Generalization and Parameter Estimation in Feedforward Netws: Some Experiments , 1989, NIPS.

[8]  Hy Murveit,et al.  1000-word speaker-independent continuous-speech recognition using hidden Markov models , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[9]  Hynek Hermansky,et al.  Continuous speech recognition using PLP analysis with multilayer perceptrons , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.

[10]  Patti Price,et al.  The DARPA 1000-word resource management database for continuous speech recognition , 1988, ICASSP-88., International Conference on Acoustics, Speech, and Signal Processing.

[11]  Hervé Bourlard,et al.  Continuous speech recognition using multilayer perceptrons with hidden Markov models , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[12]  Biing-Hwang Juang,et al.  Mixture autoregressive hidden Markov models for speech signals , 1985, IEEE Trans. Acoust. Speech Signal Process..