Efficient Learning for Discriminative Segmentation with Supermodular Losses

Several supermodular losses have been shown to improve the perceptual quality of image segmentation in a discriminative framework such as a structured output support vector machine (SVM). These loss functions do not necessarily have the same structure as the segmentation inference algorithm, so in general we may have to resort to generic submodular minimization algorithms for loss-augmented inference. Although these algorithms come with polynomial-time guarantees, they are not practical to apply to image-scale data. Many supermodular losses come with strong optimization guarantees of their own, but are not readily incorporated into a loss-augmented graph cuts procedure. This motivates our strategy of employing the alternating direction method of multipliers (ADMM) decomposition for loss-augmented inference. In doing so, we create a new API for the structured SVM that separates the maximum a posteriori (MAP) inference of the model from the loss augmentation during training. In this way, we gain computational efficiency, making new choices of loss functions practical for the first time, while simultaneously bringing the inference algorithm employed during training closer to the test-time procedure. We show improvement in both accuracy and computational performance on the Microsoft Research GrabCut database and on a brain structure segmentation task, empirically validating the use of a supermodular loss during training and the improved computational properties of the proposed ADMM approach over the Fujishige-Wolfe minimum norm point algorithm.
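To make the separation between model inference and loss augmentation concrete, the following is a minimal sketch of a scaled-form ADMM loop with two independent oracles, one that sees only the model score and one that sees only the loss. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, labels are binary, both oracles are brute-force stand-ins (a real system would presumably use graph cuts for the model and a loss-specific solver), and applying ADMM directly to discrete labels is a heuristic with no convergence guarantee. On binary labels the quadratic penalty reduces to a per-pixel term because y_i^2 = y_i, so each oracle only ever receives its own objective plus a unary correction.

```python
import itertools
import numpy as np


def brute_force_argmax(score, n):
    """Toy oracle: exhaustively maximise score(y) over y in {0,1}^n."""
    best, best_val = None, -np.inf
    for bits in itertools.product((0, 1), repeat=n):
        y = np.array(bits, dtype=float)
        val = score(y)
        if val > best_val:  # strict '>' keeps the first maximiser on ties
            best, best_val = y, val
    return best


def admm_loss_augmented(model_score, loss, n, rho=1.0, iters=50):
    """Heuristically maximise model_score(y) + loss(y) over binary y.

    Hypothetical helper: the problem is split into two copies y_a and y_b
    tied by the constraint y_a = y_b, with scaled multipliers u.
    """
    y_b = np.zeros(n)
    u = np.zeros(n)
    y_a = y_b.copy()
    for _ in range(iters):
        # Model oracle: model score plus the quadratic coupling to y_b.
        y_a = brute_force_argmax(
            lambda y: model_score(y) - 0.5 * rho * np.sum((y - y_b + u) ** 2), n
        )
        # Loss oracle: loss plus the quadratic coupling to y_a.
        y_b = brute_force_argmax(
            lambda y: loss(y) - 0.5 * rho * np.sum((y_a - y + u) ** 2), n
        )
        # Scaled dual update accumulates the disagreement between the copies.
        u = u + y_a - y_b
        if np.array_equal(y_a, y_b):  # heuristic stopping rule: copies agree
            break
    return y_a, y_b


if __name__ == "__main__":
    # Assumed toy data: linear model score and a supermodular loss given by
    # the squared number of mislabelled pixels (convex in the error count).
    w = np.array([1.5, -0.5, 0.2, -2.0])
    y_star = np.array([1.0, 0.0, 1.0, 0.0])
    y_a, y_b = admm_loss_augmented(
        lambda y: float(w @ y),
        lambda y: float(np.sum(y != y_star)) ** 2 / len(y_star),
        n=len(w),
    )
    print("model copy:", y_a, "loss copy:", y_b)
```

The design point is that `model_score` and `loss` never appear in the same subproblem, so either oracle can be swapped for a specialised solver without touching the other.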
