Structured Learning from Partial Annotations

Structured learning is appropriate when predicting structured outputs such as trees, graphs, or sequences. Most prior work requires the training set to consist of complete trees, graphs, or sequences; specifying such detailed ground truth can be tedious or infeasible for large outputs. Our main contribution is a large-margin formulation that makes structured learning from only partially annotated data possible. The resulting optimization problem is non-convex, yet can be solved efficiently by the concave-convex procedure (CCCP) with novel speedup strategies. We apply our method to a challenging tracking-by-assignment problem with a variable number of divisible objects. On this benchmark, using only 25% of a full annotation, we achieve performance comparable to a model learned from a full annotation. Finally, we offer a unifying perspective on previous work using the hinge, ramp, or max loss for structured learning, followed by an empirical comparison of their practical performance.
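To make the formulation concrete, the following is a minimal sketch of one plausible large-margin objective over partial annotations, in the spirit of the latent structural SVM of Yu and Joachims; the notation is assumed here for illustration and is not taken verbatim from the paper: w denotes the model weights, \phi a joint feature map, \Delta a task loss measuring disagreement with the annotated parts, \lambda a regularization weight, and \mathcal{Y}_i \subseteq \mathcal{Y} the set of full labelings consistent with the partial annotation of example x_i.

\[
\min_{w}\; \frac{\lambda}{2}\lVert w\rVert^{2}
\;+\; \sum_{i=1}^{n}\Big[\, \max_{y\in\mathcal{Y}} \big( w^{\top}\phi(x_i,y) + \Delta(\mathcal{Y}_i,y) \big)
\;-\; \max_{y\in\mathcal{Y}_i} w^{\top}\phi(x_i,y) \,\Big]
\]

Each summand is a difference of two convex functions (each a maximum of functions linear in w), which is exactly the structure CCCP exploits: at iterate w_t, the concave part is linearized by imputing the best consistent completion

\[
\hat{y}_i \;=\; \arg\max_{y\in\mathcal{Y}_i} w_t^{\top}\phi(x_i,y),
\]

after which the remaining problem is a convex structural SVM with \hat{y}_i treated as ground truth; alternating imputation and convex minimization decreases the objective monotonically to a local optimum. Note also how this template relates to the losses compared in the abstract: if \mathcal{Y}_i shrinks to a single fully observed labeling, the second max becomes linear and the objective reduces to the standard convex structural hinge loss, while relaxing the constraint to a max over all of \mathcal{Y} yields the ramp loss of McAllester et al. (The precise definition of the max loss is paper-specific and is omitted here.)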
