Spatial Graph for Image Classification

Spatial information in images is considered highly important for object recognition. Recent studies show that human classification accuracy can drop dramatically when the spatial information of an image is removed. The original bag-of-words (BoW) model effectively simulates such a classification process with incomplete information. To capture spatial information, spatial pyramid matching (SPM) was proposed, and it has become the most widely used scheme for spatial modeling. Given an image, SPM divides it into a series of spatial blocks on several levels and concatenates the representations obtained separately within each block. SPM greatly improves performance because it embeds spatial information into BoW. However, SPM ignores the relationships between the spatial blocks. To address this problem, we propose a new scheme based on a spatial graph whose nodes correspond to the spatial blocks in SPM and whose edges correspond to the relationships between the blocks. Thorough experiments on several popular datasets verify the advantages of the proposed scheme.
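The block structure described above can be illustrated with a minimal sketch. It assumes local features have already been quantized to visual-word indices and uses plain histogram counts as the per-block representation. The function names (`spm_histograms`, `concatenate`, `grid_edges`) are hypothetical. The 4-neighbour adjacency in `grid_edges` is only one possible way to connect blocks, chosen for illustration, and is not necessarily the edge definition used by the proposed scheme.

```python
def spm_histograms(points, words, width, height, vocab_size, levels=3):
    """Return {(level, row, col): histogram} for every SPM block.

    points: list of (x, y) feature locations; words: visual-word index per point.
    Level l splits the image into a 2**l by 2**l grid (an assumed, common layout).
    """
    blocks = {}
    for level in range(levels):
        n = 2 ** level
        for r in range(n):
            for c in range(n):
                blocks[(level, r, c)] = [0] * vocab_size
        for (x, y), w in zip(points, words):
            # Clamp so that points on the right/bottom border fall in the last cell.
            c = min(int(x * n / width), n - 1)
            r = min(int(y * n / height), n - 1)
            blocks[(level, r, c)][w] += 1
    return blocks


def concatenate(blocks):
    """Concatenate block histograms in a fixed (level, row, col) order."""
    vec = []
    for key in sorted(blocks):
        vec.extend(blocks[key])
    return vec


def grid_edges(level):
    """Hypothetical edge set: 4-neighbour adjacency among blocks of one level."""
    n = 2 ** level
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append(((level, r, c), (level, r, c + 1)))
            if r + 1 < n:
                edges.append(((level, r, c), (level, r + 1, c)))
    return edges
```

In this sketch the SPM representation is simply the output of `concatenate`, which treats each block independently. A graph-based scheme would additionally use pairs of blocks, such as those returned by `grid_edges`, to describe relationships between blocks.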
