Semantic binary coding for visual recognition via joint concept-attribute modelling

Recent years have witnessed the unprecedented efforts of visual representation for enabling various efficient and effective multimedia applications. In this paper, we propose a novel visual representation learning framework, which generates efficient semantic hash codes for visual samples by substantially exploring concepts, semantic attributes as well as their inter-correlations. Specifically, we construct a conceptual space, where the semantic knowledge of concepts and attributes is embedded. Then, we develop an effective on-line feature coding scheme for visual objects by leveraging the inter-concept relationships through the intermediate representative power of attributes. The code process is formulated as an overlapping group lasso problem, which can be efficiently solved. Finally, we may binarize the visual representation to generate efficient hash codes. Extensive experiments have been conducted to illustrate the superiority of our proposed framework on visual retrieval task as compared to state-of-the-art methods.

[1]  Liang-Tien Chia,et al.  Multi-layer group sparse coding — For concurrent image classification and annotation , 2011, CVPR 2011.

[2]  Xuelong Li,et al.  Learning Discriminative Binary Codes for Large-scale Cross-modal Retrieval , 2017, IEEE Transactions on Image Processing.

[3]  Chao Li,et al.  Image attribute learning with ontology guided fused lasso , 2015, Multimedia Tools and Applications.

[4]  Yang Yang,et al.  Zero-Shot Hashing via Transferring Supervised Knowledge , 2016, ACM Multimedia.

[5]  Shuicheng Yan,et al.  Towards efficient sparse coding for scalable image annotation , 2013, ACM Multimedia.

[6]  Ali Farhadi,et al.  Describing objects by their attributes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Peter Bühlmann Regression shrinkage and selection via the Lasso: a retrospective (Robert Tibshirani): Comments on the presentation , 2011 .

[8]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[9]  Nicole Immorlica,et al.  Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.

[10]  Xiaogang Wang,et al.  Learning Deep Representation with Large-Scale Attributes , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[11]  Ling Shao,et al.  Efficient dictionary learning for visual categorization , 2014, Comput. Vis. Image Underst..

[12]  Yi Yang,et al.  Web and Personal Image Annotation by Mining Label Correlation With Relaxed Visual Graph Embedding , 2012, IEEE Transactions on Image Processing.

[13]  Lin Wu,et al.  Exploiting Attribute Correlations: A Novel Trace Lasso-Based Weakly Supervised Dictionary Learning Method , 2017, IEEE Transactions on Cybernetics.

[14]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[15]  Min Yao,et al.  Bayesian network based semantic image classification with attributed relational graph , 2014, Multimedia Tools and Applications.

[16]  Ali Farhadi,et al.  Attribute-centric recognition for cross-category generalization , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[18]  Zi Huang,et al.  Tag localization with spatial correlations and joint group sparsity , 2011, CVPR 2011.

[19]  Svetlana Lazebnik,et al.  Locality-sensitive binary codes from shift-invariant kernels , 2009, NIPS.

[20]  David Suter,et al.  Fast Supervised Hashing with Decision Trees for High-Dimensional Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[21]  Yi Yang,et al.  Image Classification by Cross-Media Active Learning With Privileged Information , 2016, IEEE Transactions on Multimedia.

[22]  Timothy K. Shih,et al.  Distributed Multimedia Databases: Techniques and Applications , 2001 .

[23]  Yang Yang,et al.  Exploiting Concept Correlation with Attributes for Semantic Binary Representation Learning , 2017, ICIMCS.

[24]  Xin-Ping Guan,et al.  Simultaneous dimensionality reduction and dictionary learning for sparse representation based classification , 2016, Multimedia Tools and Applications.

[25]  Yi Yang,et al.  A Multimedia Retrieval Framework Based on Semi-Supervised Ranking and Relevance Feedback , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Yang Yang,et al.  Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.

[28]  R. Tibshirani,et al.  Regression shrinkage and selection via the lasso: a retrospective , 2011 .

[29]  Jean-Philippe Vert,et al.  Group lasso with overlap and graph lasso , 2009, ICML '09.

[30]  Meng Wang,et al.  Harvesting visual concepts for image search with complex queries , 2012, ACM Multimedia.

[31]  Xuelong Li,et al.  Visual Coding in a Semantic Hierarchy , 2015, ACM Multimedia.

[32]  M. Yuan,et al.  Model selection and estimation in regression with grouped variables , 2006 .

[33]  Heng Tao Shen,et al.  Hashing with Angular Reconstructive Embeddings , 2018, IEEE Transactions on Image Processing.

[34]  Yue Gao,et al.  Attribute-augmented semantic hierarchy: towards bridging semantic gap and intention gap in image retrieval , 2013, ACM Multimedia.

[35]  Junzhou Huang,et al.  Automatic Image Annotation and Retrieval Using Group Sparsity , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).

[36]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[37]  Te-Feng Su,et al.  Multi-attributed Dictionary Learning for Sparse Coding , 2013, 2013 IEEE International Conference on Computer Vision.

[38]  Zhi-Hua Zhou,et al.  Column Sampling Based Discrete Supervised Hashing , 2016, AAAI.

[39]  Xuelong Li,et al.  Robust Web Image Annotation via Exploring Multi-Facet and Structural Knowledge , 2017, IEEE Transactions on Image Processing.

[40]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[41]  Zi Huang,et al.  Robust discrete code modeling for supervised hashing , 2018, Pattern Recognit..

[42]  Huanbo Luan,et al.  Discrete Collaborative Filtering , 2016, SIGIR.

[43]  Christoph H. Lampert,et al.  Learning to detect unseen object classes by between-class attribute transfer , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[44]  Shuiwang Ji,et al.  SLEP: Sparse Learning with Efficient Projections , 2011 .