Paper information - Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition

This paper investigates the effectiveness of the deterministic annealing EM (DAEM) algorithm in acoustic modeling for speaker and speech recognition. Although the EM algorithm is widely used to approximate maximum likelihood (ML) estimates, its results depend on how the parameters are initialized. The DAEM algorithm was proposed to mitigate this problem, and its effectiveness has been confirmed on small artificial tasks. In this paper, we apply the DAEM algorithm to practical speech recognition tasks: speaker recognition based on Gaussian mixture models (GMMs) and continuous speech recognition based on hidden Markov models (HMMs). Experimental results show that the DAEM algorithm can improve recognition performance compared with the standard EM algorithm using conventional initialization algorithms, especially in flat-start training for continuous speech recognition.
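
The abstract does not spell out the annealing mechanism. As a minimal sketch, assuming the formulation in Ueda and Nakano's original DAEM paper (Neural Networks, 1998), the E-step posterior is raised to an inverse-temperature power `beta` and renormalized. A small `beta` flattens the posteriors, which weakens the influence of the initial parameters; `beta` is then raised towards 1, where the ordinary EM posterior is recovered. The function name `tempered_posteriors` and the toy scores below are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def tempered_posteriors(log_joint, beta):
    """Return DAEM-style responsibilities for one E-step.

    log_joint[n, k] holds log(weight_k * density_k(x_n)); beta is the inverse
    temperature (beta = 1 gives the ordinary EM posteriors).
    """
    scaled = beta * np.asarray(log_joint, dtype=float)
    scaled -= scaled.max(axis=1, keepdims=True)  # stabilise the exponentials
    resp = np.exp(scaled)
    return resp / resp.sum(axis=1, keepdims=True)

# Hypothetical joint scores for two frames and two mixture components.
log_joint = np.log(np.array([[0.9, 0.1],
                             [0.2, 0.8]]))

flat = tempered_posteriors(log_joint, beta=0.0)   # every row is uniform
soft = tempered_posteriors(log_joint, beta=0.5)   # softened assignments
exact = tempered_posteriors(log_joint, beta=1.0)  # standard EM posteriors
```

In a full training loop these responsibilities would feed the usual M-step updates while `beta` follows an increasing schedule. The schedule itself is a tuning choice that this sketch leaves open.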
