Vowel classification using a neural predictive HMM: a discriminative training approach

A speech recognition system is developed utilising multi-layer perceptrons (MLPs) as speech-frame predictors. A Markov chain is used to control changes in the MLP's weight parameters. Analytical results and speech recognition experiments indicate that when joint (nonlinear/linear) prediction is performed within the hidden layer of the MLP, the model is better at capturing long-term data correlations, which improves speech recognition performance. A discriminative training technique based on the maximum mutual information criterion is presented for training this class of models. The performance of the system on vowel classification tasks when trained with this method is shown to be superior to that of the same system trained using the maximum likelihood criterion.
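The difference between the two training criteria can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the uniform class priors, and the example log-likelihood values are all assumptions made for illustration. Given one log-likelihood per vowel-class model for a single utterance, maximum likelihood scores only the correct model, whereas maximum mutual information scores the correct model relative to all competing models.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def ml_objective(log_likes, label):
    """Maximum likelihood: log-likelihood under the correct class model only."""
    return log_likes[label]


def mmi_objective(log_likes, label, log_priors=None):
    """Maximum mutual information: log posterior of the correct class.

    Uniform class priors are assumed here when none are given
    (an illustrative choice, not taken from the paper).
    """
    n = len(log_likes)
    if log_priors is None:
        log_priors = [-math.log(n)] * n
    joint = [ll + lp for ll, lp in zip(log_likes, log_priors)]
    return joint[label] - log_sum_exp(joint)


# Hypothetical per-class log-likelihoods for one utterance; class 0 is correct.
example_log_likes = [-10.0, -10.5, -30.0]
ml_value = ml_objective(example_log_likes, 0)
mmi_value = mmi_objective(example_log_likes, 0)
```

In this sketch, raising the likelihood of a competing model lowers the MMI value even when the correct model's likelihood is unchanged, while the maximum likelihood value is unaffected. That dependence on the competitors is what makes the criterion discriminative.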