Large margin training for hidden Markov models with partially observed states

Large margin learning of Continuous Density HMMs with a partially labeled dataset has been extensively studied in the speech and handwriting recognition fields. Yet because the optimization problem is non-convex, previous works usually rely on severe approximations, so the problem remains open. We propose a new learning algorithm that relies on non-convex optimization and bundle methods and tackles the original optimization problem as is. It is proved to converge to a solution with accuracy ε at a rate of O(1/ε). We provide experimental results on speech and handwriting recognition that demonstrate the potential of the method.
