Use of incrementally regulated discriminative margins in MCE training for speech recognition

In this paper, we report our recent development of a novel discriminative learning technique which embeds the concept of discriminative margin into the well-established minimum classification error (MCE) method. The idea is to impose an incrementally adjusted “margin” in the loss function of the MCE algorithm so that not only are error rates minimized but discrimination “robustness” between training and test sets is also maintained. Experimental evaluation shows that the use of the margin improves a state-of-the-art MCE method by reducing digit errors by 17% and string errors by 19% in the TIDigits recognition task. The string error rate of 0.55% and digit error rate of 0.19% we have obtained are the best-ever results reported on this task in the literature.
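
To make the idea of a margin in the MCE loss concrete, the following is a minimal sketch, not the formulation used in the paper. It assumes the common sigmoid form of the MCE loss applied to a misclassification measure d (negative when a token is correctly classified, positive otherwise), and it models the margin as a shift of that sigmoid. The names `mce_loss` and `margin_schedule`, the slope `alpha`, and the linear growth of the margin per epoch are all illustrative assumptions.

```python
import math


def mce_loss(d, alpha=1.0, margin=0.0):
    """Sigmoid-smoothed MCE loss with a margin shift (illustrative assumption).

    d      : misclassification measure; d < 0 means the token is correct.
    alpha  : slope of the sigmoid (hypothetical default).
    margin : non-negative shift; the loss reaches 0.5 at d = -margin, so
             correct tokens lying closer than `margin` to the decision
             boundary are still penalized.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (d + margin)))


def margin_schedule(num_epochs, start=0.0, step=0.1):
    """Hypothetical linear schedule: the margin grows by `step` each epoch."""
    return [start + step * k for k in range(num_epochs)]


if __name__ == "__main__":
    # A correctly classified token that sits close to the decision boundary.
    d = -0.5
    for epoch, m in enumerate(margin_schedule(4, start=0.0, step=0.5)):
        print(f"epoch {epoch}: margin={m:.1f} loss={mce_loss(d, margin=m):.3f}")
```

With a zero margin the sketch reduces to the plain sigmoid loss. As the margin grows, the same correctly classified token contributes a larger loss, which pushes training to separate it further from its competitors rather than merely classify it correctly.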
