Encoding Spatial Arrangements of Visual Words for Rotation-Invariant Image Classification

Incorporating the spatial information of visual words enhances the performance of the well-known bag-of-visual-words (BoVWs) model for problems such as object category recognition. However, object images can undergo various in-plane rotations, so the spatial information must be added to the BoVWs model in a rotation-invariant manner. We present a novel approach that integrates spatial information into the BoVWs model in a rotation-invariant way by encoding the triangular relationships among the positions of identical visual words in the \(2D\) image space. Our proposed BoVWs model is based on densely sampled local features for which the dominant orientations are calculated. We thus achieve rotation invariance both globally and locally. We validate the rotation invariance of the proposed method on datasets of ancient coins and butterflies, where it achieves better performance than the conventional BoVWs model.
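
To illustrate why a triangular relationship among positions of identical visual words can be rotation-invariant, the following is a minimal sketch, not the method of the paper. It assumes, purely for illustration, that each triple of positions of the same visual word is described by the interior angles of the triangle it forms, and that these angles are accumulated into a normalized histogram. The function names (`interior_angles`, `triangle_angle_histogram`, `rotate`) and the bin count `n_bins` are hypothetical and do not come from the abstract.

```python
import math
from itertools import combinations


def interior_angles(p, q, r):
    """Return the three interior angles (radians) of the triangle p, q, r."""
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (
            math.hypot(*v1) * math.hypot(*v2)
        )
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.acos(max(-1.0, min(1.0, cos_t)))

    return angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)


def triangle_angle_histogram(positions, n_bins=9):
    """Normalized histogram of interior angles over all triples of positions.

    `positions` holds the (x, y) locations of ONE visual word in an image.
    Illustrative descriptor only; the binning scheme is an assumption.
    """
    hist = [0] * n_bins
    for p, q, r in combinations(positions, 3):
        if len({p, q, r}) < 3:
            continue  # skip triples with coincident points
        for a in interior_angles(p, q, r):
            b = min(int(a / math.pi * n_bins), n_bins - 1)
            hist[b] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * n_bins


def rotate(points, theta):
    """Rotate 2D points about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (5.0, 2.0)]
    original = triangle_angle_histogram(pts)
    rotated = triangle_angle_histogram(rotate(pts, 0.7))
    # Compare the descriptor before and after an in-plane rotation.
    print(all(abs(a - b) < 1e-9 for a, b in zip(original, rotated)))
```

Interior angles are unchanged by in-plane rotation, translation, and uniform scaling, which is why a descriptor built from them does not depend on image orientation. In this sketch an angle lying very close to a bin boundary could still change bins under floating-point rounding, and the number of triples grows cubically with the number of occurrences of a word, so a practical variant would likely subsample triples.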
