A Simple Label Switching Algorithm for Semisupervised Structural SVMs

In structured output learning, obtaining labeled data for real-world applications is usually costly, while unlabeled examples are available in abundance. Semisupervised structured classification learns from a small number of labeled examples together with a large number of unlabeled structured examples. In this work, we consider semisupervised structural support vector machines with domain constraints. The resulting optimization problem, which is in general nonconvex, contains loss terms for the labeled and unlabeled examples, along with the domain constraints. We propose a simple optimization approach that alternates between solving a supervised learning problem and a constraint matching problem. Solving the constraint matching problem is difficult for structured prediction, and we propose an efficient and effective label switching method to solve it. The alternating optimization is carried out within a deterministic annealing framework, which helps in matching the constraints effectively and in avoiding poor local minima. The algorithm is simple and easy to implement. Further, it is suitable for any structured output learning problem where exact inference is available. Experiments on benchmark sequence labeling data sets and a natural language parsing data set show that the proposed approach, though simple, achieves comparable generalization performance.
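
The alternation described above can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a plain multiclass linear model in place of a structured one, a label-count target as the only domain constraint, a greedy single-label switch as the constraint matching step, and a gradual ramp of the unlabeled-loss weight as a stand-in for deterministic annealing. All function names, parameters, and default values are hypothetical.

```python
def score(w, x, y):
    # Linear score of label y for input x.
    return sum(wi * xi for wi, xi in zip(w[y], x))

def predict(w, x, n_labels):
    # Exact inference is trivial here: enumerate the labels.
    return max(range(n_labels), key=lambda y: score(w, x, y))

def hinge(w, x, y, n_labels):
    # Margin-rescaled hinge loss with 0/1 task loss.
    return max(score(w, x, k) + (k != y) for k in range(n_labels)) - score(w, x, y)

def train(examples, n_labels, dim, epochs=50, lr=0.1, reg=0.01):
    # Supervised step (assumed): subgradient descent on cost-weighted hinge loss
    # with L2 regularization. examples is a list of (x, y, cost) triples.
    w = [[0.0] * dim for _ in range(n_labels)]
    for _ in range(epochs):
        for x, y, cost in examples:
            k = max(range(n_labels), key=lambda j: score(w, x, j) + (j != y))
            w = [[(1 - lr * reg) * v for v in row] for row in w]
            if k != y:
                for d in range(dim):
                    w[y][d] += lr * cost * x[d]
                    w[k][d] -= lr * cost * x[d]
    return w

def objective(w, xs, ys, n_labels, target, penalty):
    # Unlabeled loss plus a penalty on deviation from the target label counts.
    loss = sum(hinge(w, x, y, n_labels) for x, y in zip(xs, ys))
    violation = sum(abs(ys.count(k) - target[k]) for k in range(n_labels))
    return loss + penalty * violation

def switch_labels(w, xs, ys, n_labels, target, penalty=1.0):
    # Constraint matching step (assumed): keep any single-label switch that
    # lowers the objective, and stop when no switch helps.
    ys = list(ys)
    best = objective(w, xs, ys, n_labels, target, penalty)
    improved = True
    while improved:
        improved = False
        for i in range(len(xs)):
            for k in range(n_labels):
                if k == ys[i]:
                    continue
                old, ys[i] = ys[i], k
                val = objective(w, xs, ys, n_labels, target, penalty)
                if val < best - 1e-12:
                    best, improved = val, True
                else:
                    ys[i] = old
    return ys

def semi_supervised(labeled, unlabeled, n_labels, target, schedule=(0.01, 0.1, 1.0)):
    # Alternate constraint matching and supervised training while the weight
    # on the unlabeled loss is ramped up (assumed annealing-style schedule).
    dim = len(labeled[0][0])
    sup = [(x, y, 1.0) for x, y in labeled]
    w = train(sup, n_labels, dim)
    ys = [predict(w, x, n_labels) for x in unlabeled]
    for c_u in schedule:
        ys = switch_labels(w, unlabeled, ys, n_labels, target)
        w = train(sup + [(x, y, c_u) for x, y in zip(unlabeled, ys)], n_labels, dim)
    return w, ys

labeled = [([2.0, 1.0], 1), ([-2.0, 1.0], 0)]
unlabeled = [[1.5, 1.0], [1.0, 1.0], [-1.0, 1.0], [-1.5, 1.0]]
w, ys = semi_supervised(labeled, unlabeled, n_labels=2, target=[2, 2])
print(ys)
```

In this toy setting the second feature acts as a bias term, and `target=[2, 2]` asks for two unlabeled examples per label. For a genuinely structured problem, `predict` and the inner maximization in `hinge` would be replaced by the exact inference routine the abstract assumes is available.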
