Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions

We describe a method of incorporating task-specific cost functions into standard conditional log-likelihood (CLL) training of linear structured prediction models. The method, recently introduced in the speech recognition community, is described here for structured models generally; we highlight its connections to CLL and to max-margin learning for structured prediction (Taskar et al., 2003), and show that it optimizes a bound on risk. The approach is simple, efficient, and easy to implement, requiring very little change to an existing CLL implementation. We present experimental results on named-entity recognition, comparing the approach with several commonly used methods for training structured predictors.
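
As a sketch of the objective in standard notation (the feature vector f(x, y), weight vector θ, and cost function cost(y_i, y), with cost(y_i, y_i) = 0, are notational assumptions introduced here rather than taken from the abstract), softmax-margin training modifies CLL by adding the task cost inside the log-partition term:

\min_{\theta} \; \sum_{i=1}^{n} \Big( -\theta^{\top} f(x_i, y_i) \;+\; \log \!\sum_{y \in \mathcal{Y}(x_i)} \exp\big( \theta^{\top} f(x_i, y) + \mathrm{cost}(y_i, y) \big) \Big).

Setting the cost to zero recovers the usual CLL objective, which is why an existing CLL implementation needs only minor changes.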

[1] Adam L. Berger et al. A Maximum Entropy Approach to Natural Language Processing, 1996, CL.

[2] Zdravko Kacic et al. A novel loss function for the overall risk criterion based discriminative training of HMM models, 2000, INTERSPEECH.

[3] Andrew McCallum et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[4] Daniel Povey et al. Minimum Phone Error and I-smoothing for improved discriminative training, 2002, ICASSP.

[5] Yee Whye Teh et al. An Alternate Objective Function for Markovian Fields, 2002, ICML.

[6] Michael Collins et al. Ranking Algorithms for Named Entity Extraction: Boosting and the Voted Perceptron, 2002, ACL.

[7] Michael Collins et al. Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms, 2002, EMNLP.

[8] Ben Taskar et al. Max-Margin Markov Networks, 2003, NIPS.

[9] Erik F. Tjong Kim Sang et al. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition, 2003, CoNLL.

[10] Franz Josef Och et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.

[11] Koby Crammer et al. Online Passive-Aggressive Algorithms, 2003, J. Mach. Learn. Res.

[12] Philipp Koehn et al. Statistical Significance Tests for Machine Translation Evaluation, 2004, EMNLP.

[13] Martin Jansche et al. Maximum Expected F-Measure Training of Logistic Regression Models, 2005, HLT.

[14] Jun Suzuki et al. Training Conditional Random Fields with Multivariate Evaluation Measures, 2006, ACL.

[15] Nathan D. Ratliff et al. Subgradient Methods for Maximum Margin Structured Learning, 2006.

[16] Lawrence K. Saul et al. Large Margin Hidden Markov Models for Automatic Speech Recognition, 2006, NIPS.

[17] David A. Smith et al. Minimum Risk Annealing for Training Log-Linear Models, 2006, ACL.

[18] Kentaro Torisawa et al. A New Perceptron Algorithm for Sequence Labeling with Non-Local Features, 2007, EMNLP-CoNLL.

[19] Wolfgang Macherey et al. Lattice-based Minimum Error Rate Training for Statistical Machine Translation, 2008, EMNLP.

[20] Brian Kingsbury et al. Boosted MMI for model and feature-space discriminative training, 2008, ICASSP.

[21] Zhifei Li et al. First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests, 2009, EMNLP.

[22] Haihua Xu et al. Minimum tag error for discriminative training of conditional random fields, 2009, Inf. Sci.

[23] Noah A. Smith et al. Softmax-Margin Training for Structured Log-Linear Models, 2010.