Efficient Structured Prediction with Latent Variables for General Graphical Models

In this paper, we propose a unified framework for structured prediction with latent variables that includes hidden conditional random fields and latent structured support vector machines as special cases. Using duality, we describe a local entropy approximation for this general formulation and derive an efficient message-passing algorithm that is guaranteed to converge. We demonstrate its effectiveness on two tasks, image segmentation and 3D indoor scene understanding from single images, where our approach outperforms both latent structured support vector machines and hidden conditional random fields.
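
The abstract does not state the objective, so the following is only a minimal sketch of how a single loss can contain both special cases, assuming a temperature-style parameter (called `eps` here, a hypothetical name). With `eps = 1` and zero task loss, the function below is the negative conditional log-likelihood used by hidden conditional random fields; with `eps = 0`, it is the margin-rescaled hinge loss of latent structured support vector machines. The functions `score` and `task_loss` and the toy score table are illustrative assumptions. The sketch enumerates all configurations by brute force and does not implement the entropy approximation or the message-passing algorithm.

```python
import math


def soft_max(scores, eps):
    """Smoothed maximum: eps * log(sum(exp(s / eps))); the plain max at eps = 0."""
    m = max(scores)
    if eps == 0.0:
        return m
    # Subtract the maximum before exponentiating for numerical stability.
    return m + eps * math.log(sum(math.exp((s - m) / eps) for s in scores))


def latent_loss(score, task_loss, y_true, labels, hiddens, eps):
    """Surrogate loss for one training example, by exhaustive enumeration.

    score(y, h)         -- hypothetical model score of output y with latent state h
    task_loss(y_true, y) -- hypothetical task loss, e.g. 0/1 loss
    """
    # Loss-augmented scores over every (output, latent) pair.
    augmented = [task_loss(y_true, y) + score(y, h)
                 for y in labels for h in hiddens]
    # Scores with the output clamped to the ground truth.
    clamped = [score(y_true, h) for h in hiddens]
    return soft_max(augmented, eps) - soft_max(clamped, eps)


# Toy problem (made-up numbers): two outputs, two latent states.
TABLE = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 0.5, (1, 1): 1.5}
LABELS = [0, 1]
HIDDENS = [0, 1]


def toy_score(y, h):
    return TABLE[(y, h)]


def zero_one(y_true, y):
    return 0.0 if y == y_true else 1.0


hinge_like = latent_loss(toy_score, zero_one, 0, LABELS, HIDDENS, eps=0.0)
log_like = latent_loss(toy_score, zero_one, 0, LABELS, HIDDENS, eps=1.0)
```

In this toy setting, `hinge_like` is the gap between the best loss-augmented score and the best score consistent with the true output, and `log_like` replaces both maxima with log-sum-exp. Intermediate values of `eps` interpolate between the two.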
