Midrange Geometric Interactions for Semantic Segmentation: Constraints for Continuous Multi-label Optimization

Julia Diebold (julia.diebold@tum.de), Claudia Nieuwenhuis (cnieuwe@berkeley.edu), Daniel Cremers (cremers@tum.de)
1 Technische Universität München, Munich, Germany; 2 ICSI, UC Berkeley, Berkeley, USA
Communicated by Nikos Komodakis.

In this article we introduce the concept of midrange geometric constraints into semantic segmentation. We call these constraints ‘midrange’ because they are neither global constraints, which take all pixels into account without any spatial limitation, nor local constraints, which consider only single pixels or pairwise relations. Instead, the proposed constraints make it possible to discourage the occurrence of labels in the vicinity of each other, e.g., ‘wolf’ and ‘sheep’. ‘Vicinity’ encompasses spatial distance and specific spatial directions simultaneously, e.g., ‘plates’ are found directly above ‘tables’, but do not fly over them. The user specifies the spatial extent of the constraint for each pair of labels. Such constraints are interesting not only for scene segmentation, but also for part-based articulated or rigid objects. The reason is that object parts such as arms, torso and legs usually obey specific spatial rules. These rules are among the few things that remain valid for articulated objects over many images, and they can be expressed in terms of the proposed midrange constraints, i.e., closeness and/or direction. We show how midrange geometric constraints are formulated within a continuous multi-label optimization framework, and we give a convex relaxation that allows us to find globally optimal solutions of the relaxed problem independent of the initialization.
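To make the notion of ‘vicinity’ concrete, the following is a minimal sketch of how a midrange penalty could be evaluated on a discrete labeling. It is an illustration under stated assumptions, not the continuous formulation or the convex relaxation of the article: the vicinity of a label is modeled as a set of pixel offsets (a structuring element), the label's region is dilated by these offsets, and every pixel of the other label that falls inside the dilated region is counted. The names `dilate`, `midrange_penalty` and `box_offsets`, the `weight` parameter and the toy label images are all hypothetical.

```python
import numpy as np

def dilate(mask, offsets):
    """Dilate a boolean mask by a structuring element given as (dy, dx) offsets.

    A pixel (y, x) set in `mask` marks (y + dy, x + dx) in the result,
    for every offset that stays inside the image.
    """
    h, w = mask.shape
    out = np.zeros((h, w), dtype=bool)
    for dy, dx in offsets:
        if abs(dy) >= h or abs(dx) >= w:
            continue  # this shift moves every pixel off the grid
        src_y = slice(max(0, -dy), min(h, h - dy))
        src_x = slice(max(0, -dx), min(w, w - dx))
        dst_y = slice(max(0, dy), min(h, h + dy))
        dst_x = slice(max(0, dx), min(w, w + dx))
        out[dst_y, dst_x] |= mask[src_y, src_x]
    return out

def midrange_penalty(labels, i, j, offsets, weight=1.0):
    """Count pixels of label i that lie in the vicinity of label j.

    `offsets` encodes both distance and direction of the vicinity
    (an assumption of this sketch): a square box gives an undirected
    'closeness' constraint, a one-sided set gives a directional one.
    """
    near_j = dilate(labels == j, offsets)
    return weight * np.count_nonzero((labels == i) & near_j)

def box_offsets(radius):
    """All offsets within a square window of the given radius."""
    r = range(-radius, radius + 1)
    return [(dy, dx) for dy in r for dx in r]

# Toy scene: 0 = background, 1 = 'sheep', 2 = 'wolf'.
scene = np.array([[0, 0, 0, 0, 0],
                  [0, 1, 0, 2, 0],
                  [0, 0, 0, 0, 0],
                  [0, 0, 0, 0, 2]])
close_penalty = midrange_penalty(scene, i=2, j=1, offsets=box_offsets(2))

# Directional toy scene: 1 = 'table' (bottom row), 2 = 'plate' (top row).
# Penalize plates two or three pixels above a table, but not directly above it.
column = np.zeros((4, 3), dtype=int)
column[3, 1] = 1
column[0, 1] = 2
flying_penalty = midrange_penalty(column, i=2, j=1, offsets=[(-2, 0), (-3, 0)])
```

Choosing the offset set per label pair mirrors the statement that the user defines the spatial extent of each constraint. A symmetric box expresses pure closeness, while a one-sided set of offsets expresses a direction such as ‘far above’.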
