SemanticHash: Hash Coding Via Semantics-Guided Label Prototype Learning

In this article, we propose SemanticHash, a simple and effective deep neural network model that leverages semantic word embeddings (e.g., BERT) for hash code learning. Images and class labels are compressed into $K$-bit binary vectors by visual and semantic hash functions, respectively, which are jointly learned and aligned to optimize semantic consistency. The $K$-dimensional class label prototypes, projected from semantic word embeddings, guide the hash mapping on the image side and vice versa, so that the $K$-bit image hash codes are aligned with their semantic prototypes and are therefore more discriminative. Extensive experimental results on four benchmarks, the CIFAR10, NUS-WIDE, ImageNet, and MS-COCO datasets, demonstrate the effectiveness of our approach. We also perform studies to analyze the effects of quantization and word semantic spaces and to explain the relations among the learned class prototypes. Finally, we demonstrate the generalization capability of the proposed approach, which achieves competitive performance compared with state-of-the-art unsupervised and zero-shot hashing methods.

Impact Statement: Hash code learning is an important technology that enables efficient image retrieval on large-scale data. While existing hashing algorithms can effectively generate compact binary codes in a supervised learning setting trained with a moderate-size dataset, they are difficult to scale to large datasets and do not generalize to unseen datasets. The proposed approach overcomes these limitations. Compared with state-of-the-art methods, our solution achieves an average performance improvement of 2.1% on four moderate-size benchmarks and an improvement of 4.7% on ImageNet, a large-scale dataset with over 1.2 M training images. In addition to its superior performance on popular benchmarks for binary hash code learning, the technology also performs well in cross-dataset and zero-shot scenarios (i.e., the testing concepts are unseen during training). Our approach attains an improvement of over 17.7% in zero-shot retrieval performance compared with the state of the art in the area. This article thus provides a powerful solution for utilizing massive data for fast and accurate image retrieval in the big data era.
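The pipeline described in the abstract (projecting word embeddings to $K$-dimensional class prototypes, binarizing both prototypes and image features into $K$-bit codes, and retrieving by Hamming distance) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the linear projections `W_semantic` and `W_visual`, the random inputs, and all dimensions are hypothetical stand-ins for learned networks and real data, and no training or alignment loss is shown.

```python
import numpy as np


def binarize(x):
    """Map real-valued vectors to K-bit codes in {-1, +1} with the sign function."""
    return np.where(x >= 0, 1, -1).astype(np.int8)


def hamming_distance(codes, query):
    """Hamming distance between every row of `codes` and one `query` code."""
    k = codes.shape[1]
    # For codes in {-1, +1}: d_H(a, b) = (K - <a, b>) / 2
    return (k - codes.astype(np.int32) @ query.astype(np.int32)) // 2


rng = np.random.default_rng(0)
K, word_dim, feat_dim, num_classes = 16, 32, 64, 4  # illustrative sizes only

# Hypothetical stand-ins: random data and random projections replace the
# real word embeddings, image features, and jointly learned hash functions.
word_embeddings = rng.normal(size=(num_classes, word_dim))
W_semantic = rng.normal(size=(word_dim, K))
W_visual = rng.normal(size=(feat_dim, K))

# Semantic side: K-dimensional class label prototypes and their K-bit codes.
prototypes = word_embeddings @ W_semantic
prototype_codes = binarize(prototypes)

# Visual side: K-bit image hash codes.
image_features = rng.normal(size=(5, feat_dim))
image_codes = binarize(image_features @ W_visual)

# Retrieval: rank database images by Hamming distance to a query code.
distances = hamming_distance(image_codes, image_codes[0])
ranking = np.argsort(distances, kind="stable")
```

In a trained model, the two projections would be optimized jointly so that each image code lies close, in Hamming distance, to the code of its class prototype; here they are random, so the ranking carries no semantic meaning.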
