Deep Collaborative Multi-View Hashing for Large-Scale Image Search

Hashing could significantly accelerate large-scale image search by transforming the high-dimensional features into binary Hamming space, where efficient similarity search can be achieved with very fast Hamming distance computation and extremely low storage cost. As an important branch of hashing methods, multi-view hashing takes advantages of multiple features from different views for binary hash learning. However, existing multi-view hashing methods are either based on shallow models which fail to fully capture the intrinsic correlations of heterogeneous views, or unsupervised deep models which suffer from insufficient semantics and cannot effectively exploit the complementarity of view features. In this paper, we propose a novel Deep Collaborative Multi-view Hashing (DCMVH) method to deeply fuse multi-view features and learn multi-view hash codes collaboratively under a deep architecture. DCMVH is a new deep multi-view hash learning framework. It mainly consists of 1) multiple view-specific networks to extract hidden representations of different views, and 2) a fusion network to learn multi-view fused hash code. DCMVH associates different layers with instance-wise and pair-wise semantic labels respectively. In this way, the discriminative capability of representation layers can be progressively enhanced and meanwhile the complementarity of different view features can be exploited effectively. Finally, we develop a fast discrete hash optimization method based on augmented Lagrangian multiplier to efficiently solve the binary hash codes. Experiments on public multi-view image search datasets demonstrate our approach achieves substantial performance improvement over state-of-the-art methods.

[1]  Youngjoong Ko,et al.  A study of term weighting schemes using class information for text classification , 2012, SIGIR '12.

[2]  Bingbing Ni,et al.  Binary Coding for Partial Action Analysis with Limited Observation Ratios , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Heng Tao Shen,et al.  Hashing with Angular Reconstructive Embeddings , 2018, IEEE Transactions on Image Processing.

[4]  Zhenan Sun,et al.  Fast Supervised Discrete Hashing , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5]  Wenwu Zhu,et al.  Deep Multimodal Hashing with Orthogonal Regularization , 2015, IJCAI.

[6]  Dongqing Zhang,et al.  Large-Scale Supervised Multimodal Hashing with Semantic Correlation Maximization , 2014, AAAI.

[7]  Wei Zhang,et al.  SCRATCH: A Scalable Discrete Matrix Factorization Hashing for Cross-Modal Retrieval , 2018, ACM Multimedia.

[8]  Guiguang Ding,et al.  Collective Matrix Factorization Hashing for Multimodal Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Xuelong Li,et al.  Auto-Weighted Multi-View Learning for Image Clustering and Semi-Supervised Classification , 2018, IEEE Transactions on Image Processing.

[10]  Xin Luo,et al.  Discrete Hashing With Multiple Supervision , 2019, IEEE Transactions on Image Processing.

[11]  Xiang Zhou,et al.  Scalable Zero-Shot Learning via Binary Visual-Semantic Embeddings , 2019, IEEE Transactions on Image Processing.

[12]  Xuelong Li,et al.  Self-weighted Multiview Clustering with Multiple Graphs , 2017, IJCAI.

[13]  Katta G. Murty,et al.  Nonlinear Programming Theory and Algorithms , 2007, Technometrics.

[14]  Mohan S. Kankanhalli,et al.  MMALFM , 2018, ACM Trans. Inf. Syst..

[15]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Louis-Philippe Morency,et al.  Multimodal Machine Learning: A Survey and Taxonomy , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17]  Rui Yang,et al.  Discrete Multi-view Hashing for Effective Image Retrieval , 2017, ICMR.

[18]  Yang Yang,et al.  Adversarial Cross-Modal Retrieval , 2017, ACM Multimedia.

[19]  Zi Huang,et al.  Effective Multiple Feature Hashing for Large-Scale Near-Duplicate Video Retrieval , 2013, IEEE Transactions on Multimedia.

[20]  Fumin Shen,et al.  Multi-view Latent Hashing for Efficient Multimedia Search , 2015, ACM Multimedia.

[21]  Yi Ma,et al.  The Augmented Lagrange Multiplier Method for Exact Recovery of Corrupted Low-Rank Matrices , 2010, Journal of structural biology.

[22]  Seungjin Choi,et al.  Deep Learning to Hash with Multiple Representations , 2012, 2012 IEEE 12th International Conference on Data Mining.

[23]  Liqiang Nie,et al.  Dual Deep Neural Networks Cross-Modal Hashing , 2018, AAAI.

[24]  Weiwei Liu,et al.  Multiview Discrete Hashing for Scalable Multimedia Search , 2018, ACM Trans. Intell. Syst. Technol..

[25]  Heng Tao Shen,et al.  Exploring Auxiliary Context: Discrete Semantic Transfer Hashing for Scalable Image Retrieval , 2018, IEEE Transactions on Neural Networks and Learning Systems.

[26]  Zi Huang,et al.  Discrete Multimodal Hashing With Canonical Views for Robust Mobile Landmark Search , 2017, IEEE Transactions on Multimedia.

[27]  Yilong Yin,et al.  Distribution-Oriented Aesthetics Assessment With Semantic-Aware Hybrid Network , 2019, IEEE Transactions on Multimedia.

[28]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[29]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[30]  Wei Liu,et al.  Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Di Liu,et al.  Compact kernel hashing with multiple features , 2012, ACM Multimedia.

[32]  Weiwei Liu,et al.  Multilabel Prediction via Cross-View Search , 2018, IEEE Transactions on Neural Networks and Learning Systems.

[33]  Xuelong Li,et al.  Discrete Spectral Hashing for Efficient Similarity Retrieval , 2019, IEEE Transactions on Image Processing.

[34]  Heng Tao Shen,et al.  Efficient Supervised Discrete Multi-View Hashing for Large-Scale Multimedia Search , 2020, IEEE Transactions on Multimedia.

[35]  Ke Lu,et al.  Transfer Independently Together: A Generalized Framework for Domain Adaptation , 2019, IEEE Transactions on Cybernetics.

[36]  Mark J. Huiskes,et al.  The MIR flickr retrieval evaluation , 2008, MIR '08.

[37]  Yang Yang,et al.  Graph Convolutional Network Hashing , 2020, IEEE Transactions on Cybernetics.

[38]  Lawrence K. Saul,et al.  Think Globally, Fit Locally: Unsupervised Learning of Low Dimensional Manifold , 2003, J. Mach. Learn. Res..

[39]  Chao Li,et al.  Deep Joint Semantic-Embedding Hashing , 2018, IJCAI.

[40]  Ke Lu,et al.  Heterogeneous Domain Adaptation Through Progressive Alignment , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[41]  Yang Yang,et al.  Attribute hashing for zero-shot image retrieval , 2017, 2017 IEEE International Conference on Multimedia and Expo (ICME).

[42]  Jiwen Lu,et al.  Deep Hashing for Scalable Image Search , 2017, IEEE Transactions on Image Processing.

[43]  Yongxin Wang,et al.  SDMCH: Supervised Discrete Manifold-Embedded Cross-Modal Hashing , 2018, IJCAI.

[44]  Heng Tao Shen,et al.  Collective Reconstructive Embeddings for Cross-Modal Hashing , 2019, IEEE Transactions on Image Processing.

[45]  G. Sapiro,et al.  A collaborative framework for 3D alignment and classification of heterogeneous subvolumes in cryo-electron tomography. , 2013, Journal of structural biology.

[46]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[47]  Xinbo Gao,et al.  Label Consistent Matrix Factorization Hashing for Large-Scale Cross-Modal Similarity Search , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[48]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[49]  Feiping Nie,et al.  Multi-View Clustering and Feature Learning via Structured Sparsity , 2013, ICML.

[50]  Heng Tao Shen,et al.  Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Seungjin Choi,et al.  Multi-view anchor graph hashing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[52]  Peng Zhang,et al.  Nonlinear Dimensionality Reduction by Locally Linear Inlaying , 2009, IEEE Transactions on Neural Networks.

[53]  Feiping Nie,et al.  Efficient and Robust Feature Selection via Joint ℓ2, 1-Norms Minimization , 2010, NIPS.

[54]  Wei Liu,et al.  Classification by Retrieval: Binarizing Data and Classifiers , 2017, SIGIR.

[55]  Jungong Han,et al.  Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing , 2017, IEEE Transactions on Cybernetics.

[56]  Luo Si,et al.  Preference preserving hashing for efficient recommendation , 2014, SIGIR.

[57]  Xin-Shun Xu,et al.  Scalable Supervised Discrete Hashing for Large-Scale Search , 2018, WWW.

[58]  Ling Shao,et al.  Multiview Alignment Hashing for Efficient Image Search , 2015, IEEE Transactions on Image Processing.