Closed-Form Approximate CRF Training for Scalable Image Segmentation

We present LS-CRF, a new method for training cyclic Conditional Random Fields (CRFs) from large datasets that is inspired by classical closed-form expressions for the maximum likelihood parameters of a generative graphical model with tree topology. Training a CRF with LS-CRF requires only solving a set of independent regression problems, each of which can be solved efficiently in closed form or by an iterative solver. This makes LS-CRF orders of magnitude faster than classical CRF training based on probabilistic inference, and at the same time more flexible and easier to implement than other approximate techniques, such as pseudolikelihood or piecewise training. We apply LS-CRF to the task of semantic image segmentation, showing that it achieves on par accuracy to other training techniques at higher speed, thereby allowing efficient CRF training from very large training sets. For example, training a linearly parameterized pairwise CRF on 150,000 images requires less than one hour on a modern workstation.

[1]  Vladimir Kolmogorov,et al.  Convergent Tree-Reweighted Message Passing for Energy Minimization , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Antonio Criminisi,et al.  Object Class Segmentation using Random Forests , 2008, BMVC.

[3]  J. Friedman Greedy function approximation: A gradient boosting machine. , 2001 .

[4]  Tommi S. Jaakkola,et al.  More data means less inference: A pseudo-max approach to structured learning , 2010, NIPS.

[5]  Andrei A. Bulatov,et al.  The complexity of partition functions , 2005, Theor. Comput. Sci..

[6]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[7]  Stephen Gould,et al.  Decomposing a scene into geometric and semantically consistent regions , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[8]  Thomas Deselaers,et al.  ClassCut for Unsupervised Class Segmentation , 2010, ECCV.

[9]  J. Nocedal,et al.  A Limited Memory Algorithm for Bound Constrained Optimization , 1995, SIAM J. Sci. Comput..

[10]  Ilya Trofimov,et al.  Using boosted trees for click-through rate prediction for sponsored search , 2012, ADKDD '12.

[11]  Justin Domke,et al.  Structured Learning via Logistic Regression , 2013, NIPS.

[12]  Jörg H. Kappes,et al.  OpenGM: A C++ Library for Discrete Graphical Models , 2012, ArXiv.

[13]  Pascal Fua,et al.  SLIC Superpixels Compared to State-of-the-Art Superpixel Methods , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Matthieu Guillaumin,et al.  ImageNet Auto-Annotation with Segmentation Propagation , 2014, International Journal of Computer Vision.

[15]  Justin Domke,et al.  Learning Graphical Model Parameters with Approximate Marginal Inference , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Thorsten Joachims,et al.  Training structural SVMs when exact inference is intractable , 2008, ICML '08.

[17]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Sebastian Nowozin,et al.  Structured Learning and Prediction in Computer Vision , 2011, Found. Trends Comput. Graph. Vis..

[19]  Stephen Gould,et al.  PatchMatchGraph: Building a Graph of Dense Patch Correspondences for Label Transfer , 2012, ECCV.

[20]  Antonio Criminisi,et al.  TextonBoost for Image Understanding: Multi-Class Object Recognition and Segmentation by Jointly Modeling Texture, Layout, and Context , 2007, International Journal of Computer Vision.

[21]  J. Besag Statistical Analysis of Non-Lattice Data , 1975 .

[22]  Nikos Komodakis,et al.  Efficient training for pairwise or higher order CRFs via dual decomposition , 2011, CVPR 2011.

[23]  Vladimir Kolmogorov,et al.  On partial optimality in multi-label MRFs , 2008, ICML '08.

[24]  Svetlana Lazebnik,et al.  Superparsing - Scalable Nonparametric Image Parsing with Superpixels , 2010, International Journal of Computer Vision.

[25]  Thomas Hofmann,et al.  Large Margin Methods for Structured and Interdependent Output Variables , 2005, J. Mach. Learn. Res..

[26]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[27]  Andrew McCallum,et al.  Piecewise Training for Undirected Models , 2005, UAI.

[28]  Stephen Gould,et al.  Multiclass pixel labeling with non-local matching constraints , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Matthieu Guillaumin,et al.  Segmentation Propagation in ImageNet , 2012, ECCV.

[30]  Stefano Soatto,et al.  Class segmentation and object localization with superpixel neighborhoods , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[31]  Tommi S. Jaakkola,et al.  Learning Efficiently with Approximate Inference via Dual Losses , 2010, ICML.

[32]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[33]  Andrew Zisserman,et al.  Pylon Model for Semantic Segmentation , 2011, NIPS.

[34]  Léon Bottou,et al.  The Tradeoffs of Large Scale Learning , 2007, NIPS.

[35]  Sebastian Nowozin,et al.  Decision tree fields , 2011, 2011 International Conference on Computer Vision.