Active MAP Inference in CRFs for Efficient Semantic Segmentation

Most MAP inference algorithms for CRFs optimize an energy function in which all the potentials are known. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than that of MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to select on the fly a subset of potentials to be instantiated in the energy function, leaving the parameters of the remaining potentials unknown, and 2) to estimate the MAP labeling from such an incomplete energy function. Results on semantic segmentation benchmarks, namely PASCAL VOC 2010 and MSRC-21, show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.
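
The two steps above can be illustrated with a toy sketch. It is not the paper's algorithm: it assumes a chain-structured CRF with Potts pairwise terms, a placeholder selection rule (instantiate the node farthest from any already instantiated node), and a flat zero-cost unary for every potential left unknown. The names `expensive_unary`, `pick_next`, `viterbi_chain`, and `active_map` are hypothetical and chosen only for this illustration.

```python
def viterbi_chain(unary, pairwise_weight):
    """Exact minimum-energy labeling of a chain with Potts pairwise terms."""
    num_labels = len(unary[0])
    cost = list(unary[0])
    back = []  # back[i-1][l] = best label of node i-1 given node i takes label l
    for i in range(1, len(unary)):
        new_cost, ptr = [], []
        for l in range(num_labels):
            def step(k, l=l):
                # Cost of arriving at label l from previous label k.
                return cost[k] + (pairwise_weight if k != l else 0.0)
            best_k = min(range(num_labels), key=step)
            new_cost.append(step(best_k) + unary[i][l])
            ptr.append(best_k)
        cost = new_cost
        back.append(ptr)
    label = min(range(num_labels), key=lambda l: cost[l])
    labels = [label]
    for ptr in reversed(back):
        label = ptr[label]
        labels.append(label)
    labels.reverse()
    return labels


def pick_next(known, n):
    """Placeholder selection rule (an assumption): farthest from known nodes."""
    if not known:
        return n // 2
    candidates = [i for i in range(n) if i not in known]
    return max(candidates, key=lambda i: min(abs(i - k) for k in known))


def active_map(expensive_unary, n, num_labels, budget, pairwise_weight=1.0):
    """Instantiate at most `budget` unary potentials, then run MAP inference
    on the incomplete energy. Unknown unaries stay at a flat zero cost."""
    unary = [[0.0] * num_labels for _ in range(n)]
    known = set()
    for _ in range(min(budget, n)):
        i = pick_next(known, n)
        unary[i] = list(expensive_unary(i))  # the only costly call
        known.add(i)
    return viterbi_chain(unary, pairwise_weight), known
```

In this sketch the pairwise terms propagate the few instantiated unaries to their neighbors, which is why a labeling can still be estimated when most potentials are unknown. The selection rule here ignores the current labeling; a rule that adapts to it would be closer to the "on the fly" selection described above.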
