A Simple High Performance Approach to Semantic Segmentation

We propose a simple approach to semantic image segmentation. Our system scores low-level patches according to their class re levance, propagates these posterior probabilities to pixels and uses low- level segmentation to guide the semantic segmentation. The two main contributions of this paper are as follows. First, for the patch scoring, we describe each patch with a high-level descriptor based on the Fisher kernel and use a s et of linear classifiers. While the Fisher kernel methodology was shown t o lead to high accuracy for image classification, it has not been applied to the segmentation problem. Second, we use global image classifiers to take into account the context of the objects to be segmented. If an image as a whole is unlikely to contain an object class, then the corresponding class is not considered in the segmentation pipeline. This increases the classification a ccuracy and reduces the computational cost. We will show that despite its apparent simplicity, this system provides above state-of-the-art performance on the PASCAL VOC 2007 dataset and state-of-the-art performance on the MSRC 21 dataset.

[1]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[2]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[3]  Shimon Ullman,et al.  Combining Top-Down and Bottom-Up Segmentation , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[4]  R. Zemel,et al.  Multiscale conditional random fields for image labeling , 2004, CVPR 2004.

[5]  Nebojsa Jojic,et al.  LOCUS: learning object classes with unsupervised segmentation , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[6]  David Haussler,et al.  Exploiting Generative Models in Discriminative Classifiers , 1998, NIPS.

[7]  Fei-Fei Li,et al.  Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[8]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[9]  Lawrence Carin,et al.  Sparse multinomial logistic regression: fast algorithms and generalization bounds , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  B. Triggs,et al.  Scene segmentation with Conditional Random Fields learned from partially labeled images , 2007, NIPS 2007.

[11]  Bill Triggs,et al.  Region Classification with Markov Field Aspect Models , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[12]  Florent Perronnin,et al.  Fisher Kernels on Visual Vocabularies for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[13]  Jianguo Zhang,et al.  The PASCAL Visual Object Classes Challenge , 2006 .

[14]  Miguel Á. Carreira-Perpiñán,et al.  Multiscale conditional random fields for image labeling , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[15]  Antonio Criminisi,et al.  TextonBoost: Joint Appearance, Shape and Context Modeling for Multi-class Object Recognition and Segmentation , 2006, ECCV.

[16]  Luc Van Gool,et al.  The 2005 PASCAL Visual Object Classes Challenge , 2005, MLCW.

[17]  Alexei A. Efros,et al.  Using Multiple Segmentations to Discover Objects and their Extent in Image Collections , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[18]  Lin Yang,et al.  Multiple Class Segmentation Using A Unified Framework over Mean-Shift Patches , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Martial Hebert,et al.  A hierarchical field framework for unified context-based classification , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[20]  Andrew Zisserman,et al.  OBJ CUT , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[21]  Jamie Shotton,et al.  The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[22]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[23]  B. Schiele,et al.  Combined Object Categorization and Segmentation With an Implicit Shape Model , 2004 .