Efficient Scale- and Rotation-Invariant Encoding of Visual Words for Image Classification

This letter addresses the problem of incorporating spatial information into the bag-of-visual-words model for image classification. To incorporate such information, we propose to encode the global geometric relationships of the visual words in the 2D image plane in a scale- and rotation-invariant manner. This is achieved by measuring scale- and rotation-invariant geometric properties of triangles formed by identical visual words. Experimental results on a butterfly and fish dataset demonstrate that the proposed method is more robust to changes in scale and to image rotations than the bag-of-visual-words model.
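As an illustration of the idea, the sketch below groups keypoints by visual word, forms every triangle of keypoints that share a word, and histograms the interior angles, which do not change under translation, rotation, or uniform scaling. The abstract does not say which triangle properties the letter measures, so the choice of interior angles, the bin count, and the names `triangle_angles`, `angle_histogram`, and `similarity` are assumptions made here for illustration only. The brute-force enumeration of all triples is also not the efficient encoding the title refers to.

```python
import math
from itertools import combinations


def triangle_angles(p, q, r):
    """Return the three interior angles (radians) of triangle pqr, or None
    if two of the points coincide."""
    a = math.dist(q, r)  # side opposite p
    b = math.dist(p, r)  # side opposite q
    c = math.dist(p, q)  # side opposite r
    if min(a, b, c) == 0.0:
        return None

    def angle(opp, s1, s2):
        # Law of cosines; clamp to guard against rounding outside [-1, 1].
        cos_v = (s1 * s1 + s2 * s2 - opp * opp) / (2.0 * s1 * s2)
        return math.acos(max(-1.0, min(1.0, cos_v)))

    return [angle(a, b, c), angle(b, a, c), angle(c, a, b)]


def angle_histogram(points, words, vocab_size, bins=6):
    """For each visual word, histogram the interior angles of all triangles
    formed by keypoints assigned to that word (hypothetical encoding)."""
    hist = [[0] * bins for _ in range(vocab_size)]
    by_word = {}
    for pt, w in zip(points, words):
        by_word.setdefault(w, []).append(pt)
    for w, pts in by_word.items():
        for p, q, r in combinations(pts, 3):
            angles = triangle_angles(p, q, r)
            if angles is None:
                continue
            for ang in angles:
                idx = min(int(ang / math.pi * bins), bins - 1)
                hist[w][idx] += 1
    return hist


def similarity(pt, scale, theta, tx, ty):
    """Apply rotation by theta, uniform scaling, then translation to a point."""
    x, y = pt
    return (scale * (x * math.cos(theta) - y * math.sin(theta)) + tx,
            scale * (x * math.sin(theta) + y * math.cos(theta)) + ty)


# Toy input: three keypoints assigned to word 0, two assigned to word 1.
points = [(0.0, 0.0), (5.0, 0.0), (1.0, 3.0), (2.0, 2.0), (7.0, 1.0)]
words = [0, 0, 0, 1, 1]

original = angle_histogram(points, words, vocab_size=2)
moved = angle_histogram(
    [similarity(p, 2.5, 0.7, 10.0, -4.0) for p in points], words, vocab_size=2
)
print(original)
print(moved == original)
```

In this toy case the histogram is unchanged after the image is rotated, scaled, and shifted, whereas an encoding based on absolute keypoint positions would not be. Angles that fall very close to a bin boundary can still switch bins under floating-point rounding, so a practical version might use soft binning.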
