Paper information - Semi-supervised training of Gaussian mixture models by conditional entropy minimization

In this paper, we propose a new semi-supervised training method for Gaussian mixture models. We add a conditional entropy minimizer to the maximum mutual information criterion, which makes it possible to incorporate unlabeled data into discriminative training. The training method is simple but surprisingly effective. The preconditioned conjugate gradient method provides a reasonable convergence rate for the parameter updates. Phonetic classification experiments on the TIMIT corpus show that adding unlabeled data under our training criterion yields significant improvements.
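The criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form used here is an assumption: the sum of log class posteriors over labeled data (the maximum mutual information term) minus a weight `alpha` times the conditional entropy of the class posterior over unlabeled data. The names `objective`, `alpha`, and the data layout are hypothetical, and the preconditioned conjugate gradient optimizer mentioned in the abstract is not shown.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def gmm_log_likelihood(x, gmm):
    # log p(x | class) for a diagonal-covariance GMM.
    # gmm is a list of (weight, mean_vector, variance_vector) components.
    terms = []
    for weight, mean, var in gmm:
        log_gauss = -0.5 * sum(
            (xi - mi) ** 2 / vi + math.log(2 * math.pi * vi)
            for xi, mi, vi in zip(x, mean, var)
        )
        terms.append(math.log(weight) + log_gauss)
    return logsumexp(terms)

def log_posteriors(x, class_gmms, priors):
    # log p(class | x) for every class, via Bayes' rule.
    log_joint = [math.log(p) + gmm_log_likelihood(x, g)
                 for g, p in zip(class_gmms, priors)]
    norm = logsumexp(log_joint)
    return [lj - norm for lj in log_joint]

def objective(labeled, unlabeled, class_gmms, priors, alpha):
    # Assumed form: MMI term on labeled data minus alpha times the
    # conditional entropy of the class posterior on unlabeled data.
    mmi = sum(log_posteriors(x, class_gmms, priors)[y] for x, y in labeled)
    entropy = 0.0
    for x in unlabeled:
        lp = log_posteriors(x, class_gmms, priors)
        entropy -= sum(math.exp(l) * l for l in lp)
    return mmi - alpha * entropy

# Toy setup: two classes, one 1-D Gaussian component each.
class_gmms = [[(1.0, [-1.0], [1.0])], [(1.0, [1.0], [1.0])]]
priors = [0.5, 0.5]
labeled = [([-1.0], 0), ([1.0], 1)]
unlabeled = [[0.0]]
score = objective(labeled, unlabeled, class_gmms, priors, alpha=1.0)
```

In this toy setup the unlabeled point lies exactly between the two classes, so its posterior is maximally uncertain and the entropy term penalizes the objective. Maximizing such an objective would push decision boundaries away from regions where unlabeled data is dense.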

Mark Hasegawa-Johnson | Jui-Ting Huang

[1] Jonathan Le Roux, et al. Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error, 2007, IEEE Transactions on Audio, Speech, and Language Processing.

[2] Xiao Li, et al. Discriminative training methods for language models using conditional entropy criteria, 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.

[3] Yoshua Bengio, et al. Semi-supervised Learning by Entropy Minimization, 2004, CAP.

[4] Thorsten Joachims, et al. Transductive Inference for Text Classification using Support Vector Machines, 1999, ICML.

[5] J. Shewchuk. An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, 1994.

[6] Dale Schuurmans, et al. Semi-Supervised Conditional Random Fields for Improved Sequence Segmentation and Labeling, 2006, ACL.

[7] Jean-Luc Gauvain, et al. Lightly supervised and unsupervised acoustic model training, 2002, Comput. Speech Lang.

[8] Mark J. F. Gales, et al. Unsupervised Training for Mandarin Broadcast News and Conversation Transcription, 2007, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07.

[9] Jeff A. Bilmes, et al. The semi-supervised switchboard transcription project, 2009, INTERSPEECH.

[10] Hsiao-Wuen Hon, et al. Speaker-independent phone recognition using hidden Markov models, 1989, IEEE Trans. Acoust. Speech Signal Process.

[11] Mark Hasegawa-Johnson, et al. Maximum mutual information estimation with unlabeled data for phonetic classification, 2008, INTERSPEECH.

[12] Dimitri P. Bertsekas, et al. Nonlinear Programming, 1997.

[13] Andrew K. Halberstadt. Heterogeneous acoustic measurements and multiple classifiers for speech recognition, 1999.

[14] Jonathan G. Fiscus, et al. DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM, NIST, 1993.

[15] Alexander Zien, et al. Semi-Supervised Learning, 2006.

[16] Hermann Ney, et al. Unsupervised training of acoustic models for large vocabulary continuous speech recognition, 2005, IEEE Transactions on Speech and Audio Processing.

[17] Jeff A. Bilmes, et al. On the semi-supervised learning of multi-layered perceptrons, 2009, INTERSPEECH.