Aspect Guided Text Categorization with Unobserved Labels

This paper proposes a novel multiclass classification method and demonstrates its advantage in text categorization with a large label space, most importantly when some of the labels were not observed in the training data. The key insight is to introduce intermediate aspect variables that encode properties of the labels and serve as a joint representation for observed and unobserved labels. Classification can then be viewed as a structure learning problem with natural constraints on assignments to the aspect variables. We solve it as a constrained optimization problem over multiple learners and show significant improvement in classifying short sentences into a large label space of categories, including previously unobserved categories.
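
The aspect-based idea can be illustrated with a minimal sketch. It is not the paper's implementation: the label names, aspect values, and scores are hypothetical, and a brute-force search over valid labels stands in for the constrained optimization described above. Each label is defined by one value per aspect, one learner is assumed per aspect, and inference picks the label whose aspect assignment has the highest total score, so a label never seen in training can still be predicted.

```python
from typing import Dict, List, Tuple

# Hypothetical label space: each label is defined by one value per aspect
# (here, a location aspect and an action aspect). "kitchen/clean" is assumed
# to be missing from the training data, yet it shares aspect values with
# labels that were observed.
LABEL_ASPECTS: Dict[str, Tuple[str, str]] = {
    "kitchen/cook": ("kitchen", "cook"),
    "kitchen/clean": ("kitchen", "clean"),
    "bathroom/clean": ("bathroom", "clean"),
}


def predict(aspect_scores: List[Dict[str, float]]) -> str:
    """Return the valid label with the highest summed aspect score.

    aspect_scores holds one dict per aspect, mapping an aspect value to the
    score that aspect's learner assigned to it for the current input.
    Only assignments that correspond to a label in LABEL_ASPECTS are
    considered; this is the constraint on the aspect variables.
    """
    def total(label: str) -> float:
        values = LABEL_ASPECTS[label]
        return sum(scores.get(v, 0.0) for scores, v in zip(aspect_scores, values))

    return max(LABEL_ASPECTS, key=total)


# Case 1: the aspect learners favour "kitchen" and "clean", so the
# unobserved label "kitchen/clean" is selected.
unseen_case = predict([
    {"kitchen": 1.2, "bathroom": 0.3},
    {"cook": 0.1, "clean": 0.9},
])

# Case 2: the independently best values are "bathroom" and "cook", which
# do not form a valid label; the constraint forces the best valid label.
constrained_case = predict([
    {"kitchen": 0.8, "bathroom": 1.0},
    {"cook": 0.9, "clean": 0.5},
])
```

Enumerating every label is only practical for small label spaces; for a large one, the same objective would be handed to a constrained optimizer.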
