Phoneme modelling using continuous mixture densities

This paper deals with the use of continuous mixture densities for phoneme modelling in large-vocabulary continuous speech recognition. Continuous mixture densities are applied to the emission probability density functions of hidden Markov models for phonemes in order to account for phonetic-context dependencies. It is shown that continuous mixture densities yield parameter estimates that are accurate and, at the same time, robust with respect to the limited amount of training data. Training and recognition algorithms for mixture densities in the framework of phoneme modelling are described. Recognition results are presented for a 917-word task that requires only 7 min of speech for training and has an overlap of 43 words between the training vocabulary and the test vocabulary.
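As a rough illustration of what a continuous mixture emission density looks like, the sketch below evaluates the log of a weighted sum of component densities for one acoustic feature vector. The abstract does not state the component form, so Gaussian components with diagonal covariance are an assumption here, and the function names (log_gaussian_diag, log_mixture_density) are hypothetical, not taken from the paper.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x.

    Assumption: Gaussian components; the paper may use another form.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_mixture_density(x, weights, means, variances):
    """Log of sum_k c_k * N(x; mean_k, var_k) for one HMM state.

    The weights c_k are assumed positive and to sum to one. The sum is
    computed with the log-sum-exp trick to avoid numerical underflow.
    """
    terms = [
        math.log(c) + log_gaussian_diag(x, m, v)
        for c, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Illustrative values only: a two-component mixture over 2-dimensional features.
score = log_mixture_density(
    x=[0.3, -1.2],
    weights=[0.6, 0.4],
    means=[[0.0, -1.0], [1.0, 0.5]],
    variances=[[1.0, 0.5], [2.0, 1.0]],
)
```

In an HMM recogniser, a score of this kind would stand in for the emission log-probability of a state at each time frame. Working in the log domain keeps products of many small frame probabilities numerically stable.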
