Kernel based latent semantic sparse hashing for large-scale retrieval from heterogeneous data sources

Abstract In recent years, we have witnessed the growing popularity of integrating nearest neighbor search with hashing for effective and efficient similarity search. However, most previous cross-modal hashing methods do not consider the semantic correlation between multi-modal representations and instead project the heterogeneous data directly into a joint space using a linear projection. To address these challenges and bridge the semantic gap more efficiently, we propose a method named kernel based latent semantic sparse hashing (KLSSH). KLSSH first captures high-level latent semantic information and then exploits the equivalence between optimizing the code inner products and the Hamming distances. More specifically, it first employs sparse coding to obtain primary latent features of images and matrix factorization to generate features of text concepts, so that latent semantic features are learned in a high-level abstraction space. Next, it maps the latent semantic features to compact binary codes using a kernel method. The kernel scheme allows the hash functions to be trained sequentially and efficiently, one bit at a time, and yields very short and discriminative hash codes. It also reduces the quantization loss, which improves retrieval performance. Experiments conducted on three benchmark multi-modal datasets demonstrate the superiority of the proposed method over state-of-the-art techniques.
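The equivalence between code inner products and Hamming distances mentioned in the abstract can be checked numerically: for two r-bit codes u, v in {-1, +1}^r, the inner product equals r minus twice the Hamming distance. The following is a minimal sketch of that identity together with a generic kernel hash function. It is not the KLSSH training procedure: the Gaussian kernel, the anchor points, the random projection weights, and every name in the code are assumptions made for illustration, and the sparse coding, matrix factorization, and bit-by-bit optimization steps are not reproduced.

```python
import math
import random

def rbf_kernel(x, anchor, gamma):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, anchor))
    return math.exp(-gamma * sq_dist)

def kernel_hash(x, anchors, weights, bias, gamma):
    """Map x to a code in {-1, +1}: one sign of a weighted sum of centered kernel values per bit."""
    feats = [rbf_kernel(x, a, gamma) - b for a, b in zip(anchors, bias)]
    code = []
    for w in weights:  # one weight vector per bit
        score = sum(wj * fj for wj, fj in zip(w, feats))
        code.append(1 if score >= 0 else -1)
    return code

def hamming(u, v):
    """Number of positions where two codes differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

def inner(u, v):
    """Inner product of two {-1, +1} codes."""
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(0)
dim, n_anchors, n_bits, gamma = 8, 5, 16, 0.1

# Stand-ins for latent semantic features; in the paper these would come from
# sparse coding (images) and matrix factorization (text).
latent = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(20)]
anchors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_anchors)]

# Center each kernel feature by its mean over the sample.
bias = [sum(rbf_kernel(x, a, gamma) for x in latent) / len(latent) for a in anchors]

# Random placeholder weights; a real method would learn these one bit at a time.
weights = [[rng.gauss(0, 1) for _ in range(n_anchors)] for _ in range(n_bits)]

codes = [kernel_hash(x, anchors, weights, bias, gamma) for x in latent]

# Identity: <u, v> = r - 2 * Hamming(u, v) for codes in {-1, +1}^r.
identity_holds = all(
    inner(u, v) == n_bits - 2 * hamming(u, v) for u in codes for v in codes
)
print("inner product / Hamming identity holds:", identity_holds)
```

Because the identity is exact, ranking database items by largest code inner product gives the same order as ranking them by smallest Hamming distance, which is why an objective written in terms of inner products can stand in for one written in terms of Hamming distances.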
