Self-organization in mixture densities of HMM based speech recognition

In this paper, experiments are presented in which the Self-Organizing Map (SOM) and Learning Vector Quantization (LVQ) are applied to training mixture density hidden Markov models (HMMs) for automatic speech recognition. The decoding of spoken words into text is performed using speaker-dependent, but vocabulary- and context-independent, phoneme HMMs. Each HMM has a set of states, and the output density of each state is a unique mixture of Gaussian densities. The mixture densities are trained by segmental versions of SOM and LVQ3: SOM is applied to initialize and smooth the mixture densities, and LVQ3 to simply and robustly decrease recognition errors.
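To make the discriminative step concrete, the following is a minimal sketch of a single standard LVQ3 update (Kohonen's rule) applied to a codebook of Gaussian mean vectors. The hyperparameter values (`alpha`, `epsilon`, window width `w`) and the function name are illustrative assumptions, not taken from the paper, and the segmental variants used in the paper operate on whole phoneme segments rather than single frames as shown here.

```python
import numpy as np

def lvq3_step(codebook, labels, x, y, alpha=0.05, epsilon=0.3, w=0.2):
    """One LVQ3 update: adjust the two codebook vectors nearest to
    sample x (with class label y). alpha, epsilon, w are assumed
    example values for the learning rate, damping factor, and
    relative window width."""
    d = np.linalg.norm(codebook - x, axis=1)
    i, j = np.argsort(d)[:2]                      # two nearest prototypes
    di, dj = d[i], d[j]
    s = (1 - w) / (1 + w)
    in_window = min(di / dj, dj / di) > s if min(di, dj) > 0 else True
    if labels[i] == y and labels[j] == y:
        # both prototypes correct: pull both toward x, damped by epsilon
        codebook[i] += epsilon * alpha * (x - codebook[i])
        codebook[j] += epsilon * alpha * (x - codebook[j])
    elif in_window and (labels[i] == y) != (labels[j] == y):
        # one correct, one wrong, and x falls inside the window:
        # attract the correct prototype, repel the wrong one
        for k in (i, j):
            sign = 1.0 if labels[k] == y else -1.0
            codebook[k] += sign * alpha * (x - codebook[k])
    return codebook
```

The window condition keeps updates confined to samples near the decision boundary between the two classes, which is what makes LVQ3 a robust error-reducing refinement rather than a full re-estimation of the densities.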