Bundled Local Features for Image Representation

Local features have been widely used for image representation. Traditional methods often treat each local feature independently, or model the correlations among local features only through a fixed spatial partition. However, local features are correlated and should be modeled jointly. Moreover, because images vary widely, predefined partition rules are likely to introduce noisy information. To address these problems, we propose a novel bundled local features method for efficient image representation and apply it to classification. Specifically, we first extract local features and, treating each local feature as a central point, bundle it with its neighbors using an over-complete set of spatial shapes. The most discriminative bundled features are then selected by minimizing reconstruction error, and the resulting encoding parameters are used to represent each image in matrix form. Finally, we train bilinear classifiers with a quadratic hinge loss to predict image classes. The proposed method combines local features appropriately and efficiently to produce discriminative representations. Experimental results on three image data sets show the effectiveness of the proposed method compared with other local feature combination strategies.
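The final classification step can be illustrated with a minimal sketch. It assumes a rank-one bilinear classifier with score u^T X v + b for a matrix-form representation X and a label y in {-1, +1}; the rank-one form, the function names, and the single-sample gradient step are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def bilinear_score(X, u, v, b=0.0):
    """Score of a matrix-form representation X under a rank-one bilinear model (assumed form)."""
    return float(u @ X @ v + b)

def quadratic_hinge_loss(score, y):
    """Quadratic (squared) hinge loss for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score) ** 2

def gradient_step(X, y, u, v, b, lr=0.1):
    """One gradient-descent step on the quadratic hinge loss for a single sample."""
    margin = 1.0 - y * bilinear_score(X, u, v, b)
    if margin <= 0.0:
        # The sample already satisfies the margin, so the loss and its gradient are zero.
        return u, v, b
    g = -2.0 * margin * y  # derivative of the loss with respect to the score
    # Chain rule: d(score)/du = X v, d(score)/dv = X^T u, d(score)/db = 1.
    return u - lr * g * (X @ v), v - lr * g * (X.T @ u), b - lr * g
```

Because the score is linear in u when v is fixed (and vice versa), a model of this kind is commonly trained by alternating updates over the two factors; the joint step above is only the simplest way to show how the loss drives the parameters.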
