On the use of regions for semantic image segmentation

Recent methods increasingly use image regions (i.e. super-pixels) obtained in an unsupervised way to improve semantic image segmentation. This paper presents a detailed study of the role and benefit of these regions at different steps of the segmentation process. For the purpose of this benchmark, we propose a simple semantic segmentation system that uses a hierarchy of regions. We compare it with a patch-based system with similar settings, which allows us to evaluate the contribution of each component of the system. Both systems are evaluated on the standard MSRC-21 dataset and obtain competitive results. We show that the proposed region-based system can achieve good results without any complex regularization, while its patch-based counterpart becomes competitive when using an image prior and regularization methods. The latter benefits more from CRF-based regularization, yielding state-of-the-art results with simple constraints based only on the leaf regions exploited in the pairwise potential.
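To make the contrast between patch-level and region-level labeling concrete, here is a minimal sketch, not the authors' system. It assumes per-pixel class scores are already available and that a single flat region map (not a hierarchy) is given. The function names `pool_scores_by_region` and `label_pixels_by_region` are hypothetical and chosen only for illustration.

```python
import numpy as np


def pool_scores_by_region(pixel_scores, regions):
    """Average per-pixel class scores inside each region.

    pixel_scores: float array of shape (H, W, C), one score per class.
    regions: int array of shape (H, W), region ids in 0..R-1
             (assumed to come from any unsupervised segmentation).
    Returns an array of shape (R, C) with the mean score per region.
    """
    n_classes = pixel_scores.shape[2]
    flat_scores = pixel_scores.reshape(-1, n_classes)
    flat_regions = regions.reshape(-1)
    n_regions = int(flat_regions.max()) + 1

    # Accumulate the scores of all pixels belonging to each region.
    sums = np.zeros((n_regions, n_classes))
    np.add.at(sums, flat_regions, flat_scores)

    # Divide by region size; guard against unused region ids.
    counts = np.bincount(flat_regions, minlength=n_regions)
    return sums / np.maximum(counts, 1)[:, None]


def label_pixels_by_region(pixel_scores, regions):
    """Give every pixel the best-scoring class of its region."""
    region_scores = pool_scores_by_region(pixel_scores, regions)
    region_labels = region_scores.argmax(axis=1)
    return region_labels[regions]


if __name__ == "__main__":
    # Toy 2x2 image with two classes and two regions (top row, bottom row).
    scores = np.array([[[0.9, 0.1], [0.4, 0.6]],
                       [[0.2, 0.8], [0.3, 0.7]]])
    regions = np.array([[0, 0],
                        [1, 1]])
    print(scores.argmax(axis=2))                    # independent per-pixel labels
    print(label_pixels_by_region(scores, regions))  # labels pooled over regions
```

In this toy input, the top-right pixel prefers class 1 on its own but is assigned class 0 once scores are averaged over its region, which illustrates how regions can act as an implicit form of smoothing without an explicit regularizer.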
