Hybrid training method for tied mixture density hidden Markov models using learning vector quantization and Viterbi estimation

In this work, the output density functions of hidden Markov models (HMMs) are phoneme-wise tied mixture Gaussians. Modified versions of Viterbi training and of corrective tuning based on learning vector quantization (LVQ) are described for training these tied mixture density HMMs. The mean vectors of the mixture Gaussians are initialized by first building a small self-organizing map for each phoneme and then combining the maps into a single large codebook, which is trained by LVQ. The proposed training methods are evaluated on a speech recognition system for Finnish phoneme sequences. Compared with the corresponding continuous density and semi-continuous HMMs, the phoneme-wise tied mixture HMMs perform better in terms of the number of parameters, the recognition time, and the average error rate.
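The LVQ step applied to the combined codebook can be illustrated with a minimal sketch. It assumes the basic LVQ1 rule (the nearest codebook vector moves toward a sample of its own class and away from a sample of another class); the abstract does not say which LVQ variant or which modifications are used, so this is not the authors' exact procedure. The function names, the learning rate `alpha`, and the toy two-dimensional data are hypothetical.

```python
def nearest(codebook, x):
    """Return the index of the codebook vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((m - v) ** 2 for m, v in zip(codebook[i][1], x)),
    )


def lvq1_step(codebook, x, label, alpha=0.05):
    """Apply one basic LVQ1 update in place and return the index of the winning vector.

    codebook: list of (phoneme_label, mean_vector) pairs, mean_vector a list of floats.
    x:        feature vector of the training sample.
    label:    phoneme label of the training sample.
    alpha:    learning rate (hypothetical value, not taken from the paper).
    """
    i = nearest(codebook, x)
    win_label, mean = codebook[i]
    # Attract the winner if its label matches the sample, repel it otherwise.
    sign = 1.0 if win_label == label else -1.0
    codebook[i] = (win_label, [m + sign * alpha * (v - m) for m, v in zip(mean, x)])
    return i


# Toy codebook standing in for small per-phoneme maps merged into one codebook.
codebook = [("a", [0.0, 0.0]), ("a", [1.0, 0.0]), ("i", [4.0, 4.0])]
winner = lvq1_step(codebook, [1.0, 1.0], "a", alpha=0.5)
```

In this toy run the second vector is nearest to the sample and shares its label, so it moves halfway toward the sample. In a tied mixture setting, the tuned vectors would then serve as the Gaussian mean vectors.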
