High-order Nonlocal Hashing for Unsupervised Cross-modal Retrieval

Because hashing enables compact storage and fast querying of large-scale data, hashing techniques for cross-modal search have attracted extensive attention. Despite considerable progress, unsupervised cross-modal hashing still lacks reliable similarity supervision and struggles to handle the heterogeneity between modalities. To address these issues, we devise a new deep hashing model, termed High-order Nonlocal Hashing (HNH), which facilitates cross-modal retrieval with the following advantages. First, unlike existing methods that mainly rely on low-level, local-view similarity to guide hash learning, we propose a high-order affinity measure that considers multi-modal neighbourhood structures from a nonlocal perspective, thereby capturing the similarity relationships between data items more comprehensively. Second, a common representation is introduced to correlate the different modalities. By enforcing alignment between the modality-specific descriptors and the common representation, HNH substantially narrows the modality gap and maintains intra-modal consistency. Third, an affinity-preserving objective function is designed to generate high-quality binary codes. Extensive experiments demonstrate the superiority of HNH over state-of-the-art baselines in unsupervised cross-modal retrieval tasks.
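The abstract does not give the formula for the high-order affinity measure, so the following is only a rough sketch of the general idea, not the HNH formulation itself. It assumes that first-order (local) affinity is a weighted mix of per-modality cosine similarities, and that a nonlocal affinity can be obtained by comparing items through their entire neighbourhood rows. The function names, the `alpha` weight, and the random features are all hypothetical.

```python
import numpy as np


def cosine_affinity(features):
    """Pairwise cosine similarity between the rows of `features`."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero rows
    return unit @ unit.T


def high_order_affinity(img_feats, txt_feats, alpha=0.5):
    """Illustrative nonlocal affinity (an assumption, not the paper's definition)."""
    # First-order, local-view affinity: a convex mix of the two modalities.
    s_img = cosine_affinity(img_feats)
    s_txt = cosine_affinity(txt_feats)
    s_local = alpha * s_img + (1.0 - alpha) * s_txt
    # High-order affinity: two items are similar when their whole
    # neighbourhood profiles (rows of s_local) point in similar directions.
    return cosine_affinity(s_local)


# Toy paired data: 6 items with 16-d image and 8-d text features.
rng = np.random.default_rng(0)
img = rng.normal(size=(6, 16))
txt = rng.normal(size=(6, 8))
S = high_order_affinity(img, txt)
```

In this sketch `S` is a symmetric 6 x 6 matrix with a unit diagonal; in an unsupervised setting a matrix of this kind could serve as the similarity target that learned binary codes are trained to preserve.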
