Structured Weak Semantic Space Construction for Visual Categorization

Visual features have been widely used for image representation and categorization. However, visual features are often inconsistent with human perception, and constructing an explicit semantic space remains an open problem. To alleviate these two problems, we propose constructing a structured weak semantic space for image representation. An exemplar classifier is first trained to separate each training image from the other images for weak semantic space construction. However, because each exemplar classifier separates only one training image from the rest, it has limited semantic separability; moreover, the outputs of different exemplar classifiers are inconsistent with each other. We therefore construct the weak semantic space jointly using a structured constraint, which is achieved by imposing a low-rank constraint, together with a sparsity constraint, on the outputs of the exemplar classifiers. An alternating optimization procedure is used to learn the exemplar classifiers. Since the proposed method does not depend on the initial image representation strategy, we can use various visual features for efficient exemplar classifier training (e.g., Fisher vector-based methods and convolutional neural network-based methods). We apply the proposed structured weak semantic space-based image representation to categorization. Experimental results on several public image data sets demonstrate the effectiveness of the proposed method.
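
The pipeline described above can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract gives no objective function, so the code makes several assumptions. Ridge regression stands in for the exemplar classifiers, and the low-rank and sparsity constraints are applied to the response matrix after training rather than jointly during training. The names `train_exemplar_classifiers`, `low_rank_sparse_split`, `lam_rank`, and `lam_sparse` are hypothetical. The alternating loop minimizes 0.5*||R - L - S||_F^2 + lam_rank*||L||_* + lam_sparse*||S||_1 by exact block updates.

```python
import numpy as np

def train_exemplar_classifiers(X, reg=1.0):
    """Fit one linear classifier per training image (image i = +1, rest = -1).
    Assumption: ridge regression replaces the exemplar classifier of the paper."""
    n, d = X.shape
    Y = -np.ones((n, n))
    np.fill_diagonal(Y, 1.0)                      # column i is the target of exemplar i
    Xb = np.hstack([X, np.ones((n, 1))])          # append a bias feature
    A = Xb.T @ Xb + reg * np.eye(d + 1)
    return np.linalg.solve(A, Xb.T @ Y)           # shape (d + 1, n)

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft_threshold(M, tau):
    """Entrywise shrinkage: proximal operator of tau * L1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def objective(R, L, S, lam_rank, lam_sparse):
    fit = 0.5 * np.sum((R - L - S) ** 2)
    nuclear = np.sum(np.linalg.svd(L, compute_uv=False))
    return fit + lam_rank * nuclear + lam_sparse * np.sum(np.abs(S))

def low_rank_sparse_split(R, lam_rank=1.0, lam_sparse=0.1, n_iter=50):
    """Alternate exact updates of the low-rank part L and the sparse part S."""
    L = np.zeros_like(R)
    S = np.zeros_like(R)
    for _ in range(n_iter):
        L = svt(R - S, lam_rank)                  # best L with S fixed
        S = soft_threshold(R - L, lam_sparse)     # best S with L fixed
    return L, S

# Toy data: 20 "images" with 8-dimensional features (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 8))
W = train_exemplar_classifiers(X)
R = np.hstack([X, np.ones((20, 1))]) @ W          # responses: image x exemplar
L, S = low_rank_sparse_split(R, lam_rank=1.0, lam_sparse=0.1)
# Row i of L is a structured "weak semantic" representation of image i.
```

In this sketch each row of `L` would serve as the image representation passed to a downstream categorizer, while `S` absorbs responses that do not fit the shared low-rank structure. The base features `X` could come from any extractor, which mirrors the abstract's point that the method is independent of the initial representation.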
