Probability estimation by feed-forward networks in continuous speech recognition

The authors review the use of feed-forward neural networks as estimators of probability densities in hidden Markov modelling. In this paper, they are mostly concerned with radial basis function (RBF) networks. They note the isomorphism of RBF networks to tied mixture density estimators; additionally, they note that RBF networks are trained to estimate posteriors, whereas tied mixture density estimators estimate likelihoods. They show how the neural network training should be modified to resolve this mismatch. They also discuss problems with discriminative training, particularly the problem of dealing with unlabelled training data and the mismatch between model and data priors.
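The posterior/likelihood mismatch mentioned above can be illustrated with Bayes' rule: a posterior P(q|x) divided by the class prior P(q) equals the likelihood p(x|q) up to a factor p(x) that is the same for every class. The sketch below shows this rescaling, which is commonly used in hybrid network/HMM systems. It is not necessarily the training modification the authors propose, which the abstract does not spell out. The function name and the numbers are hypothetical, chosen only for illustration.

```python
def scaled_likelihoods(posteriors, priors):
    """Divide each class posterior P(q|x) by its prior P(q).

    By Bayes' rule the result equals p(x|q) / p(x), i.e. the likelihood
    up to a factor that is identical for every class q.
    """
    if len(posteriors) != len(priors):
        raise ValueError("posteriors and priors must have the same length")
    return [post / prior for post, prior in zip(posteriors, priors)]


# Hypothetical values for three HMM states at a single frame.
posteriors = [0.6, 0.3, 0.1]  # network outputs, summing to 1
priors = [0.5, 0.4, 0.1]      # relative state frequencies in the training data
scaled = scaled_likelihoods(posteriors, priors)
```

Because the omitted factor p(x) is shared by all states at a given frame, using these scaled values in place of likelihoods does not change which state sequence a Viterbi decoder prefers. If the priors implied by the training data differ from those assumed by the model, the choice of divisor matters, which is one form of the prior mismatch the abstract refers to.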
