Image Classification Based on Modified BOW Model

Image classifications the basis to solve visual tracking, image segmentation, scenes understanding and other complex visual tasks. Bag of words(Bow) model is initially applied in text classification area, introduced in image proceeding and recognition on account of its simple and effective. This paper follows the standard bag-of-words pipeline, but substitutes original SIFT descriptor with DSP-SIFT(domain-size pooling sift). Then, taking account of the truth that an DSP-SIFT was developed for gray images which limits its performance with regard to some colored objects, we present to add CN(color-name) descriptor to collect the color information to form visual words of image. The experimental results over fifteen scene categories and Caltech 101 datasets indicate that our method can achieve very promising performance.

[1]  Pietro Perona,et al.  Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[2]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Ming-Ai Li,et al.  A novel feature extraction method for scene recognition based on Centered Convolutional Restricted Boltzmann Machines , 2015, Neurocomputing.

[4]  Inderjit S. Dhillon,et al.  A Divisive Information-Theoretic Feature Clustering Algorithm for Text Classification , 2003, J. Mach. Learn. Res..

[5]  Thomas Brox,et al.  Unsupervised feature learning by augmenting single images , 2013, ICLR.

[6]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[7]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[8]  Subhransu Maji,et al.  Classification using intersection kernel support vector machines is efficient , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Aly A. Farag,et al.  CSIFT: A SIFT Descriptor with Color Invariant Characteristics , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[10]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[11]  Inderjit S. Dhillon,et al.  Information-theoretic co-clustering , 2003, KDD '03.

[12]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[13]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[14]  Joost van de Weijer,et al.  Describing Reflectances for Color Segmentation Robust to Shadows, Highlights, and Textures , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Cordelia Schmid,et al.  Learning Color Names for Real-World Applications , 2009, IEEE Transactions on Image Processing.

[16]  Jonathan Balzer,et al.  Multi-view feature engineering and learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Qi Tian,et al.  Image classification by non-negative sparse coding, low-rank and sparse decomposition , 2011, CVPR 2011.

[18]  Antonio Torralba,et al.  Modeling the Shape of the Scene: A Holistic Representation of the Spatial Envelope , 2001, International Journal of Computer Vision.

[19]  Thomas Brox,et al.  Descriptor Matching with Convolutional Neural Networks: a Comparison to SIFT , 2014, ArXiv.

[20]  Matthew A. Brown,et al.  Invariant Features from Interest Point Groups , 2002, BMVC.

[21]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[22]  Fahad Shahbaz Khan,et al.  Discriminative Color Descriptors , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Stefano Soatto,et al.  Domain-size pooling in local descriptors: DSP-SIFT , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).