Contextual Exemplar Classifier-Based Image Representation for Classification

Local features have become a popular basis for image representation in recent years, most often within the bag-of-visual-words scheme. Although effective, this scheme has two drawbacks. First, the local regions from which features are extracted are not discriminative enough for visual tasks on their own, so local features must be combined. Second, the semantic gap between visual features and human perception limits performance. To address these two problems, we propose a novel contextual exemplar classifier-based method for image representation and apply it to classification tasks. Each exemplar classifier is trained to separate one training image from the images of other classes. We partition each image into a number of regions and use the responses of these exemplar classifiers as the representation of each region. The contextual relationship between regions is then modeled using a mixture of Dirichlet distributions. A bilayer model with $L_{2}$ constraints is used to predict image classes. Experimental results on the Natural Scene, Caltech-101/256, Flower-17/102, and SUN-397 data sets show that the proposed method outperforms state-of-the-art local feature-based methods for image classification.
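The exemplar-classifier step described above can be illustrated with a minimal sketch. It assumes a weighted logistic-regression classifier trained by stochastic gradient descent, which is a stand-in chosen for brevity since the abstract does not state the classifier type. The function names (`train_exemplar`, `train_all_exemplars`, `region_representation`) and the toy two-dimensional features are hypothetical, and the Dirichlet mixture context model and the bilayer predictor are not covered.

```python
import math


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_exemplar(positive, negatives, epochs=200, lr=0.1, l2=0.01):
    """Fit a linear classifier separating ONE positive image from negatives.

    Assumption: logistic loss with SGD; the single positive is up-weighted
    so that it balances the negatives.
    """
    w, b = [0.0] * len(positive), 0.0
    samples = [(positive, 1.0, float(len(negatives)))]
    samples += [(x, 0.0, 1.0) for x in negatives]
    for _ in range(epochs):
        for x, y, weight in samples:
            z = max(-30.0, min(30.0, dot(w, x) + b))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            g = weight * (p - y)
            w = [wi - lr * (g * xi + l2 * wi) for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def train_all_exemplars(features, labels):
    """Train one exemplar classifier per training image.

    The negatives for image i are the training images of other classes.
    """
    classifiers = []
    for x, y in zip(features, labels):
        negatives = [f for f, l in zip(features, labels) if l != y]
        classifiers.append(train_exemplar(x, negatives))
    return classifiers


def region_representation(region_feature, classifiers):
    """Represent a region by the responses of all exemplar classifiers."""
    return [dot(w, region_feature) + b for w, b in classifiers]


# Toy data (hypothetical): two images per class, 2-D features.
train_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
train_labels = [0, 0, 1, 1]

exemplars = train_all_exemplars(train_features, train_labels)
responses = region_representation([0.95, 0.05], exemplars)
print(responses)  # one response per training image
```

In this sketch a region is described by as many values as there are training images, so a region that resembles class-0 images should score higher on the class-0 exemplars than on the class-1 exemplars.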
