Marginal Structured SVM with Hidden Variables

In this work, we propose the marginal structured SVM (MSSVM) for structured prediction with hidden variables. MSSVM properly accounts for the uncertainty of hidden variables, and can significantly outperform the previously proposed latent structured SVM (LSSVM; Yu & Joachims (2009)) and other state-of-the-art methods, especially when that uncertainty is large. Our method also yields a smoother objective function, so gradient-based optimization converges significantly faster for MSSVMs than for LSSVMs. We also show that our method consistently outperforms hidden conditional random fields (HCRFs; Quattoni et al. (2007)) on both simulated and real-world datasets. Furthermore, we propose a unified framework that includes our method and several existing methods as special cases, and provides insights into the comparison of different models in practice.
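As a rough illustration of why accounting for hidden-variable uncertainty gives a smoother objective, the sketch below compares maximizing over hidden states with a tempered log-sum-exp over them. This is an assumption made for illustration only, not the formulation from the paper: the function name `soft_max_over_hidden`, the `temperature` parameter, and the toy scores are all hypothetical.

```python
import math


def soft_max_over_hidden(scores, temperature):
    """Tempered log-sum-exp over hypothetical hidden-state scores.

    temperature == 0 returns the hard maximum (non-smooth in the scores);
    temperature == 1 returns log(sum(exp(s))), a smooth marginalization.
    """
    if temperature == 0.0:
        return max(scores)
    m = max(scores)  # subtract the maximum for numerical stability
    total = sum(math.exp((s - m) / temperature) for s in scores)
    return m + temperature * math.log(total)


# Toy scores for three hidden states (hypothetical values).
# Two states score almost equally, so uncertainty about the hidden state is large.
scores = [1.0, 0.9, -2.0]

hard = soft_max_over_hidden(scores, 0.0)          # keeps only the best state
marginal = soft_max_over_hidden(scores, 1.0)      # accounts for every state
nearly_hard = soft_max_over_hidden(scores, 0.01)  # approaches the hard maximum

print(hard, marginal, nearly_hard)
```

The log-sum-exp value always lies between the hard maximum and the maximum plus `temperature * log(n)` for `n` hidden states, and it is differentiable everywhere, whereas the hard maximum has kinks wherever two hidden states tie.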

[1]  Nir Friedman,et al.  Probabilistic Graphical Models - Principles and Techniques , 2009 .

[2]  Xiaolong Wang,et al.  Protein-protein interaction site prediction based on conditional random fields , 2007, Bioinform..

[3]  Ben Taskar,et al.  Max-Margin Markov Networks , 2003, NIPS.

[4]  Tamir Hazan,et al.  A Primal-Dual Message-Passing Algorithm for Approximated Large Scale Structured Prediction , 2010, NIPS.

[5]  Alan L. Yuille,et al.  The Concave-Convex Procedure , 2003, Neural Computation.

[6]  Trevor Darrell,et al.  Conditional Random Fields for Object Recognition , 2004, NIPS.

[7]  Nathan Ratliff,et al.  (Online) Subgradient Methods for Structured Prediction , 2007 .

[8]  Trevor Darrell,et al.  Hidden Conditional Random Fields for Gesture Recognition , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9]  William T. Freeman,et al.  Removing camera shake from a single photograph , 2006, SIGGRAPH 2006.

[10]  J. Andrew Bagnell,et al.  (Approximate) Subgradient Methods for Structured Prediction , 2007, International Conference on Artificial Intelligence and Statistics.

[11]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[12]  Thorsten Joachims,et al.  Learning structural SVMs with latent variables , 2009, ICML '09.

[13]  Yang Wang,et al.  Max-margin hidden conditional random fields for human action recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  S. Sathiya Keerthi,et al.  Deterministic Annealing for Semi-Supervised Structured Output Learning , 2012, AISTATS.

[15]  Frédo Durand,et al.  Efficient marginal likelihood optimization in blind deconvolution , 2011, CVPR 2011.

[16]  William T. Freeman,et al.  Latent hierarchical structural learning for object detection , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Hugo Larochelle,et al.  Loss-sensitive Training of Probabilistic Conditional Random Fields , 2011, ArXiv.

[18]  Ming-Wei Chang,et al.  Unified Expectation Maximization , 2012, NAACL.

[19]  David A. Smith,et al.  Improving NLP through Marginalization of Hidden Syntactic Structure , 2012, EMNLP-CoNLL.

[20]  Sebastian Nowozin,et al.  Structured Prediction and Learning in Computer Vision , 2011 .

[21]  Qiang Liu,et al.  Variational algorithms for marginal MAP , 2011, J. Mach. Learn. Res..

[22]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008 .

[23]  B. Triggs,et al.  Scene segmentation with Conditional Random Fields learned from partially labeled images , 2007, NIPS 2007.

[24]  Andrew McCallum,et al.  Introduction to Statistical Relational Learning , 2007 .

[25]  Ye Xu,et al.  Hyperlink Prediction in Hypernetworks Using Latent Social Features , 2013, Discovery Science.

[26]  Bill Triggs,et al.  Scene Segmentation with CRFs Learned from Partially Labeled Images , 2007, NIPS.

[27]  Yasubumi Sakakibara,et al.  RNA secondary structural alignment with conditional random fields , 2005, ECCB/JBI.

[28]  Ben Taskar,et al.  Introduction to Statistical Relational Learning (Adaptive Computation and Machine Learning) , 2007 .

[29]  Daphne Koller,et al.  Modeling Latent Variable Uncertainty for Loss-based Learning , 2012, ICML.

[30]  Sebastian Nowozin,et al.  Structured Learning and Prediction in Computer Vision , 2011, Found. Trends Comput. Graph. Vis..

[31]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[32]  Claire Cardie,et al.  Multi-Level Structured Models for Document-Level Sentiment Classification , 2010, EMNLP.

[33]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[34]  Antonio Criminisi,et al.  Object categorization by learned universal visual dictionary , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[35]  Kevin Miller,et al.  Max-Margin Min-Entropy Models , 2012, AISTATS.

[36]  Marc Pollefeys,et al.  Efficient Structured Prediction with Latent Variables for General Graphical Models , 2012, ICML.

[37]  Trevor Darrell,et al.  Hidden Conditional Random Fields , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.