Discrete Sparse Hashing for Cross-Modal Similarity Search

Cross-modal hashing approaches have achieved great success in cross-modal similarity search. However, most existing cross-modal hashing methods relax the discrete constraints to solve the hashing model and set the weights of the different modalities manually, both of which can significantly degrade retrieval performance. Moreover, they are sensitive to noise because of the widely used \(l_2\)-norm loss function. To address these problems, this paper proposes a novel hashing method, Discrete Sparse Hashing (DSH), that efficiently learns unified binary codes. In the DSH model, unified hash codes are learned directly by discrete sparse coding in a low-dimensional latent space shared by the different modalities, which avoids large quantization error and makes the learned codes robust owing to the sparsity of the binary codes. In addition, the weights of the different modalities are adjusted adaptively to the training data. Extensive experiments on three databases demonstrate that DSH outperforms most state-of-the-art methods.
