Paper information - High-order nonlocal Hashing for unsupervised cross-modal retrieval

Because hashing enables efficient storage and fast querying of big data, hashing techniques for cross-modal search have attracted extensive attention. Despite the success achieved so far, unsupervised cross-modal hashing still lacks reliable similarity supervision and struggles with the heterogeneity between different modalities. To address these issues, we devise a new deep hashing model, termed High-order Nonlocal Hashing (HNH), that facilitates cross-modal retrieval with the following advantages. First, unlike existing methods that mainly use low-level, local-view similarity to guide hash learning, we propose a high-order affinity measure that considers the multi-modal neighbourhood structures from a nonlocal perspective, thereby capturing the similarity relationships between data items more comprehensively. Second, a common representation is introduced to correlate the different modalities. By enforcing the modality-specific descriptors and the common representation to be aligned with each other, HNH bridges the modality gap and maintains intra-consistency. Third, an affinity-preserving objective function is designed to generate high-quality binary codes. Extensive experiments demonstrate the superiority of HNH over state-of-the-art baselines in unsupervised cross-modal retrieval tasks.
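To make the idea of a high-order affinity and an affinity-preserving objective more concrete, the following is a minimal NumPy sketch. It is not the formulation used in the paper, which the abstract does not spell out. The function names, the mixing weight `alpha`, the use of row-wise cosine similarity as the second-order measure, and the squared-error loss are all assumptions chosen for illustration.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of `features`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero rows
    return unit @ unit.T


def high_order_affinity(image_feats, text_feats, alpha=0.5):
    """Hypothetical high-order affinity (an assumption, not the paper's measure).

    Step 1: fuse the per-modality (local-view) cosine similarities.
    Step 2: compare whole similarity rows, so two items count as similar
            when they relate to all other items in a similar way.
    """
    s_image = cosine_similarity_matrix(image_feats)
    s_text = cosine_similarity_matrix(text_feats)
    s_local = alpha * s_image + (1.0 - alpha) * s_text
    return cosine_similarity_matrix(s_local)


def affinity_preserving_loss(codes, affinity):
    """Mean squared gap between code similarity and the target affinity.

    `codes` holds +1/-1 entries, one row per item (illustrative loss only).
    """
    n_bits = codes.shape[1]
    code_similarity = codes @ codes.T / n_bits  # values lie in [-1, 1]
    return float(np.mean((code_similarity - affinity) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_feats = rng.normal(size=(6, 8))   # toy image descriptors
    text_feats = rng.normal(size=(6, 5))    # toy text descriptors
    affinity = high_order_affinity(image_feats, text_feats)
    codes = np.sign(rng.normal(size=(6, 16)))  # random 16-bit codes
    print(affinity.shape, affinity_preserving_loss(codes, affinity))
```

In a trained model the codes would come from modality-specific networks and the loss would be minimised by gradient descent; here random codes only show how the two quantities fit together.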
