Object Categorization Using Class-Specific Representations

Object categorization refers to the task of automatically classifying objects based on the visual content. Existing approaches simply represent each image with the visual features without considering the specific characters of images within the same class. However, objects of the same class may exhibit unique characters, which should be represented accordingly. In this brief, we propose a novel class-specific representation strategy for object categorization. For each class, we first model the characters of images within the same class using Gaussian mixture model (GMM). We then represent each image by calculating the Euclidean distance and relative Euclidean distance between the image and the GMM model for each class. We concatenate the representations of each class for joint representation. In this way, we can represent an image by not only considering the visual contents but also combining the class-specific characters. Experiments on several public available data sets validate the superiority of the proposed class-specific representation method over well-established algorithms for object category predictions.

[1]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[2]  Jian Dong,et al.  Contextualizing Object Detection and Classification , 2015, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Thomas Mensink,et al.  Image Classification with the Fisher Vector: Theory and Practice , 2013, International Journal of Computer Vision.

[4]  Jean Ponce,et al.  Learning mid-level features for recognition , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Qi Tian,et al.  Object categorization in sub-semantic space , 2014, Neurocomputing.

[6]  Yihong Gong,et al.  Linear spatial pyramid matching using sparse coding for image classification , 2009, CVPR.

[7]  David A. Forsyth,et al.  Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary , 2002, ECCV.

[8]  Rob Fergus,et al.  Stochastic Pooling for Regularization of Deep Convolutional Neural Networks , 2013, ICLR.

[9]  Tao Mei,et al.  Contextual Bag-of-Words for Visual Categorization , 2011, IEEE Transactions on Circuits and Systems for Video Technology.

[10]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[11]  Andrew Zisserman,et al.  A Visual Vocabulary for Flower Classification , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[12]  Zhiwei Li,et al.  Max-Margin Dictionary Learning for Multiclass Image Categorization , 2010, ECCV.

[13]  Koen E. A. van de Sande,et al.  Evaluating Color Descriptors for Object and Scene Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Xiaoqin Zhang,et al.  Use bin-ratio information for category and scene classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[15]  Qi Tian,et al.  Bundled Local Features for Image Representation , 2018, IEEE Transactions on Circuits and Systems for Video Technology.

[16]  Nuno Vasconcelos,et al.  Holistic Context Models for Visual Recognition , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Hao Su,et al.  Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification , 2010, NIPS.

[18]  Kristen Grauman,et al.  Relative attributes , 2011, 2011 International Conference on Computer Vision.

[19]  James M. Rehg,et al.  Beyond the Euclidean distance: Creating effective visual codebooks using the Histogram Intersection Kernel , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[20]  Thomas Brox,et al.  Striving for Simplicity: The All Convolutional Net , 2014, ICLR.

[21]  Shuicheng Yan,et al.  Visual classification with multi-task joint sparse representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22]  Xuelong Li,et al.  Image Categorization by Learning a Propagated Graphlet Path , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[23]  Radford M. Neal Pattern Recognition and Machine Learning , 2007, Technometrics.

[24]  Yi Ma,et al.  Learning Category-Specific Dictionary and Shared Dictionary for Fine-Grained Image Categorization , 2014, IEEE Transactions on Image Processing.

[25]  Qi Tian,et al.  Image classification by non-negative sparse coding, low-rank and sparse decomposition , 2011, CVPR 2011.

[26]  Yoshua Bengio,et al.  Maxout Networks , 2013, ICML.

[27]  Qi Tian,et al.  Structured Weak Semantic Space Construction for Visual Categorization , 2018, IEEE Transactions on Neural Networks and Learning Systems.

[28]  Huijun Gao,et al.  Feature Combination and the kNN Framework in Object Classification , 2016, IEEE Transactions on Neural Networks and Learning Systems.

[29]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[30]  Trevor Darrell,et al.  Beyond spatial pyramids: Receptive field learning for pooled image features , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  H. T. Kung,et al.  Stable and Efficient Representation Learning with Nonnegativity Constraints , 2014, ICML.

[32]  Yihong Gong,et al.  Locality-constrained Linear Coding for image classification , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[33]  Tao Mei,et al.  Image Tag Refinement With View-Dependent Concept Representations , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[34]  Majid Komeili,et al.  Local Feature Selection for Data Classification , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[35]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[36]  Min Xiao,et al.  Feature Space Independent Semi-Supervised Domain Adaptation via Kernel Matching , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[37]  Tao Mei,et al.  Relaxing from Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Qi Tian,et al.  Fine-Grained Image Classification via Low-Rank Sparse Coding With General and Class-Specific Codebooks , 2017, IEEE Transactions on Neural Networks and Learning Systems.

[39]  Nitish Srivastava,et al.  Discriminative Transfer Learning with Tree-based Priors , 2013, NIPS.

[40]  Qi Tian,et al.  Incremental Codebook Adaptation for Visual Representation and Categorization , 2018, IEEE Transactions on Cybernetics.

[41]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[42]  Hugo Larochelle,et al.  A Deep and Autoregressive Approach for Topic Modeling of Multimodal Data , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[43]  Qi Tian,et al.  Multiview Label Sharing for Visual Representations and Classifications , 2018, IEEE Transactions on Multimedia.

[44]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[45]  Matthieu Guillaumin,et al.  Incremental Learning of Random Forests for Large-Scale Image Classification , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Qi Tian,et al.  Beyond Explicit Codebook Generation: Visual Representation Using Implicitly Transferred Codebooks , 2015, IEEE Transactions on Image Processing.

[47]  Tao Mei,et al.  Concurrent Multiple Instance Learning for Image Categorization , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  Fei-Fei Li,et al.  What, where and who? Classifying events by scene and object recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[50]  Jian Yang,et al.  A Locality-Constrained and Label Embedding Dictionary Learning Algorithm for Image Classification , 2017, IEEE Transactions on Neural Networks and Learning Systems.