Regularizing Structured Classifier with Conditional Probabilistic Constraints for Semi-supervised Learning

Constraints have been shown to be an effective way to incorporate unlabeled data for semi-supervised structured classification. We observe that constraints are often conditional and probabilistic; moreover, the condition of a constraint can depend either on observations alone (which we call an x-type constraint) or also on hidden variables (which we call a y-type constraint). We wish to design a constraint formulation that can flexibly model the constraint probability for both x-type and y-type constraints, and then use it to regularize general structured classifiers for semi-supervision. Surprisingly, none of the existing models has such a constraint formulation. In this paper, we therefore propose a new conditional probabilistic formulation for modeling both x-type and y-type constraints. We also recognize the inference complication that y-type constraints introduce, and propose a systematic selective evaluation approach to realize the constraints efficiently. Finally, we evaluate our model in three applications (named entity recognition, part-of-speech tagging, and entity information extraction) on nine data sets in total. We show that our model is generally more accurate and more efficient than the state-of-the-art baselines. Our code and data are available at https://bitbucket.org/vwz/cikm2016-cpf/.
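To make the x-type / y-type distinction concrete, the following is a minimal illustrative sketch, not the formulation proposed in the paper. It assumes a hypothetical toy scoring function (`toy_score`) in place of a real structured classifier, a two-label tag set, and brute-force enumeration of label sequences instead of the paper's selective evaluation. It computes the probability that a constraint's consequence holds given that its condition holds, where the condition looks either only at the observed tokens (x-type) or at the hidden labels (y-type).

```python
from itertools import product

LABELS = ("O", "PER")  # assumed toy tag set


def toy_score(tokens, labels):
    """Hypothetical unnormalized score standing in for a structured classifier."""
    score = 1.0
    for tok, lab in zip(tokens, labels):
        if tok[0].isupper() and lab == "PER":
            score *= 3.0  # capitalized tokens favor PER
    for prev, cur in zip(labels, labels[1:]):
        if prev == cur:
            score *= 2.0  # adjacent labels favor agreement
    return score


def posterior(tokens):
    """Brute-force p(y | x) by enumerating every label sequence."""
    seqs = list(product(LABELS, repeat=len(tokens)))
    weights = [toy_score(tokens, y) for y in seqs]
    z = sum(weights)
    return {y: w / z for y, w in zip(seqs, weights)}


def constraint_probability(tokens, condition, consequence):
    """Expected rate at which `consequence` holds where `condition` holds."""
    cond_mass = 0.0
    both_mass = 0.0
    for y, p in posterior(tokens).items():
        for i in range(len(tokens)):
            if condition(tokens, y, i):
                cond_mass += p
                if consequence(tokens, y, i):
                    both_mass += p
    return both_mass / cond_mass if cond_mass > 0 else float("nan")


tokens = ["John", "Smith", "spoke"]

# x-type: the condition depends only on the observation x.
x_prob = constraint_probability(
    tokens,
    condition=lambda x, y, i: x[i][0].isupper(),
    consequence=lambda x, y, i: y[i] == "PER",
)

# y-type: the condition depends on a hidden label y.
y_prob = constraint_probability(
    tokens,
    condition=lambda x, y, i: i > 0 and y[i - 1] == "PER",
    consequence=lambda x, y, i: y[i] == "PER",
)

print(f"x-type constraint probability: {x_prob:.3f}")
print(f"y-type constraint probability: {y_prob:.3f}")
```

In the x-type case the set of positions where the condition fires is fixed by the input, so only the consequence needs model marginals. In the y-type case the condition itself is uncertain, which is why both the numerator and the denominator require inference over the hidden labels; the enumeration used here is exponential in sequence length and only serves to illustrate that extra cost.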

Kevin Chen-Chuan Chang | Vincent Wenchen Zheng
