Image classification by addition of spatial information based on histograms of orthogonal vectors

The Bag-of-Visual-Words (BoVW) model is widely used for image classification, object recognition, and image retrieval. In the BoVW model, local features are quantized and the 2-D image space is represented as an orderless histogram of visual words. Image classification performance suffers because this orderless representation discards the spatial layout of the image. This paper presents a novel image representation that incorporates spatial information into the inverted index of the BoVW model. The spatial information is added by calculating the global relative spatial orientation of visual words in a rotation-invariant manner. To do this, we compute the geometric relationship between triplets of identical visual words by calculating an orthogonal vector relative to each point in the triplet. The histogram of visual words is then calculated on the basis of the magnitudes of these orthogonal vectors. This calculation provides unique information about the relative positions of visual words when they are collinear. The proposed image representation is evaluated on four standard image benchmarks. The experimental results and quantitative comparisons show that the proposed representation outperforms existing state-of-the-art methods in terms of classification accuracy.
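
The abstract does not spell out how the orthogonal vectors are constructed or binned, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that, for every triplet of keypoints assigned to the same visual word, the "orthogonal vector" of a point is the perpendicular dropped from that point onto the line through the other two points, and that its length is binned into a per-word histogram. The function names, the `(word_id, x, y)` keypoint format, and the `bin_edges` parameter are all hypothetical.

```python
import bisect
import math
from collections import defaultdict
from itertools import combinations


def perpendicular_magnitude(p, a, b):
    """Length of the perpendicular from point p to the line through a and b."""
    base = math.hypot(b[0] - a[0], b[1] - a[1])
    if base == 0.0:
        # a and b coincide, so no line is defined: fall back to |p - a|.
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # |cross(b - a, p - a)| is twice the triangle area; dividing by the base
    # length gives the height, which is unchanged by rotating the image.
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return abs(cross) / base


def orthogonal_vector_histogram(keypoints, vocab_size, bin_edges):
    """Build a flat histogram of length vocab_size * (len(bin_edges) + 1).

    keypoints: iterable of (word_id, x, y) tuples (assumed format).
    bin_edges: sorted magnitude thresholds (assumed; not given in the abstract).
    """
    points_by_word = defaultdict(list)
    for word, x, y in keypoints:
        points_by_word[word].append((x, y))

    n_bins = len(bin_edges) + 1
    hist = [0] * (vocab_size * n_bins)
    for word, pts in points_by_word.items():
        # Only triplets of *identical* visual words are considered.
        for a, b, c in combinations(pts, 3):
            # One orthogonal vector per point of the triplet.
            for p, q, r in ((a, b, c), (b, a, c), (c, a, b)):
                magnitude = perpendicular_magnitude(p, q, r)
                bin_idx = bisect.bisect_right(bin_edges, magnitude)
                hist[word * n_bins + bin_idx] += 1
    return hist


# Word 0 forms a 3-4-5 right triangle; word 1 has three collinear points.
keypoints = [(0, 0, 0), (0, 4, 0), (0, 0, 3), (1, 0, 0), (1, 1, 1), (1, 2, 2)]
hist = orthogonal_vector_histogram(keypoints, vocab_size=2, bin_edges=[1.0, 3.5])
```

Under this reading, collinear triplets yield zero-length vectors and all land in the lowest bin. Enumerating all triplets is cubic in the number of occurrences of a word, so a practical implementation would likely need sampling or a cap on points per word; that choice is likewise an assumption here.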
