Constraint-Driven Training of Complex Models Using MCMC

Standard machine learning approaches require labeled data, but labeling data for each task, language, and domain of interest is not feasible. Consequently, there has been much interest in developing training algorithms that can leverage constraints from prior knowledge to augment or replace labeled data. Most previous work in this area assumes that efficient inference algorithms exist for the model being trained. For many NLP tasks of interest, such as entity resolution, however, complex models that require approximate inference are advantageous. In this paper we study algorithms for training complex models using constraints from prior knowledge. We propose an MCMC-based approximation to Generalized Expectation (GE) training and compare it to Constraint-Driven SampleRank (CDSR). Sequence labeling experiments demonstrate that MCMC GE closely approximates exact GE, and that GE can substantially outperform CDSR. We then apply these methods to train densely-connected citation resolution models. Both methods yield highly accurate models (up to 94% mean pairwise F1) with only two simple constraints.
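
The core computation behind the proposed approximation can be illustrated concretely. GE training scores a model by how closely the expectation of a constraint feature under the model, E_{p_θ(y|x)}[φ(x, y)], matches a target value φ̃ supplied by prior knowledge (e.g., by penalizing a distance such as (φ̃ − E[φ])²); when the model is too densely connected for exact inference, that expectation can instead be estimated as an average over MCMC samples. The following Python sketch shows the idea for a single-site Gibbs sampler. It is a minimal illustration under assumed interfaces, not the paper's implementation; local_score, phi, and the toy data are hypothetical stand-ins.

```python
import math
import random

def gibbs_sample_labels(x, labels, local_score, n_sweeps=200, rng=random):
    """Single-site Gibbs sampler over a label sequence.

    x           : observed token sequence
    labels      : candidate label set
    local_score : local_score(x, y, i, lab) -> unnormalized log-score of
                  setting y[i] = lab given the rest of y (a stand-in for
                  summing the model's factors that touch position i)
    Yields one full label sequence per sweep.
    """
    y = [rng.choice(labels) for _ in x]  # arbitrary initial state
    for _ in range(n_sweeps):
        for i in range(len(x)):
            # Unnormalized conditional p(y_i | y_-i, x) from the local factors,
            # exponentiated with the usual max-trick for numerical stability.
            scores = [local_score(x, y, i, lab) for lab in labels]
            m = max(scores)
            weights = [math.exp(s - m) for s in scores]
            r, acc = rng.random() * sum(weights), 0.0
            for lab, w in zip(labels, weights):
                acc += w
                if r <= acc:
                    y[i] = lab
                    break
        yield list(y)

def mcmc_constraint_expectation(x, labels, local_score, phi,
                                n_sweeps=200, burn_in=50):
    """Monte Carlo estimate of E_{p(y|x)}[phi(x, y)]: the model expectation
    of a constraint feature, averaged over post-burn-in Gibbs samples."""
    total = count = 0
    for t, y in enumerate(gibbs_sample_labels(x, labels, local_score, n_sweeps)):
        if t >= burn_in:
            total += phi(x, y)
            count += 1
    return total / count

if __name__ == "__main__":
    # Toy model: each label weakly prefers to match its observed token.
    def local_score(x, y, i, lab):
        return 1.0 if lab == x[i] else 0.0

    # Hypothetical constraint feature: fraction of positions labeled "B".
    def phi(x, y):
        return sum(1 for lab in y if lab == "B") / len(y)

    x = ["A", "B", "A", "A", "B"]
    estimate = mcmc_constraint_expectation(x, ["A", "B"], local_score, phi)
    target = 0.4  # assumed target expectation from prior knowledge
    print("E[phi] ~= %.3f, GE penalty ~= %.4f"
          % (estimate, (target - estimate) ** 2))
```

In exact GE the expectation (and the covariance terms in its gradient) would be computed by dynamic programming; replacing it with the sample average above is what lets the criterion be applied to densely-connected models such as the citation resolution factor graphs used in the experiments.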
