A new class of fenonic Markov word models for large vocabulary continuous speech recognition

A technique for constructing hidden Markov models for the acoustic representation of words is described. The models, built from combinations of acoustically based subword units called fenones, are derived automatically from one or more sample utterances of words. They are more flexible than previously reported fenone-based word models and lead to an improved capability of modeling variations in pronunciation. In addition, their construction is simplified, because it can be done using the well-known forward-backward algorithm for the parameter estimation of hidden Markov models. Experimental results obtained on a 5000-word vocabulary continuous speech recognition task are presented to illustrate some of the benefits associated with the new models. Multonic baseforms resulted in a reduction of 16% in the average error rate obtained for ten speakers.
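As an illustration of the forward-backward algorithm mentioned above, the following is a minimal sketch for a generic discrete-output hidden Markov model. It is not the paper's construction procedure: the function name, the two-state toy model, and the label sequence are all hypothetical, outputs are assumed to be emitted by states rather than on transitions, and no scaling is applied, so it is suitable only for short sequences.

```python
def forward_backward(init, trans, emit, obs):
    """Return (gamma, likelihood) for a discrete-output HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability that state i emits label k
    obs         : sequence of observed label indices
    """
    n, T = len(init), len(obs)

    # Forward pass: alpha[t][i] = P(obs[0..t], state at time t is i)
    alpha = [[0.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = init[i] * emit[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            incoming = sum(alpha[t - 1][i] * trans[i][j] for i in range(n))
            alpha[t][j] = emit[j][obs[t]] * incoming

    # Backward pass: beta[t][i] = P(obs[t+1..T-1] | state at time t is i)
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n)
            )

    # Total probability of the observation sequence under the model
    likelihood = sum(alpha[T - 1])

    # State posteriors: gamma[t][i] = P(state at time t is i | obs)
    gamma = [
        [alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
        for t in range(T)
    ]
    return gamma, likelihood


# Hypothetical toy model: two states (left-to-right), two output labels.
init = [1.0, 0.0]
trans = [[0.5, 0.5], [0.0, 1.0]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]

gamma, likelihood = forward_backward(init, trans, emit, obs)
```

The posteriors in `gamma` are the quantities that a re-estimation step would accumulate as expected counts when updating the model parameters; a full training loop would repeat this pass over all sample utterances until the likelihood stops improving.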
