Speech recognition using demi-syllable neural prediction model

The Neural Prediction Model is the speech recognition model based on pattern prediction by multilayer perceptrons. Its effectiveness was confirmed by the speaker-independent digit recognition experiments. This paper presents an improvement in the model and its application to large vocabulary speech recognition, based on subword units. The improvement involves an introduction of "backward prediction," which further improves the prediction accuracy of the original model with only "forward prediction". In application of the model to speaker-dependent large vocabulary speech recognition, the demi-syllable unit is used as a subword recognition unit. Experimental results indicated a 95.2% recognition accuracy for a 5000 word test set and the effectiveness was confirmed for the proposed model improvement and the demi-syllable subword units.

[1]  Esther Levin,et al.  Word recognition using hidden control neural architecture , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[2]  Alex Waibel,et al.  Large vocabulary recognition using linked predictive neural networks , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[3]  Lawrence R. Rabiner,et al.  A tutorial on hidden Markov models and selected applications in speech recognition , 1989, Proc. IEEE.

[4]  Takao Watanabe,et al.  Large vocabulary word recognition based on demi-syllable hidden Markov model using small amount of training data , 1989, International Conference on Acoustics, Speech, and Signal Processing,.

[5]  Ken-ichi Iso,et al.  Speaker-independent word recognition using a neural prediction model , 1990, International Conference on Acoustics, Speech, and Signal Processing.

[6]  Ken-ichi Iso,et al.  Large vocabulary speech recognition using neural prediction model , 1991, [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing.