SitNet: Discrete Similarity Transfer Network for Zero-shot Hashing

Hashing has recently been widely used for fast image retrieval. With semantic information as supervision, hashing approaches perform much better, especially when combined with deep convolutional neural networks (CNNs). In practice, however, new concepts emerge every day, which makes it infeasible to collect supervised information for retraining the hashing model. In this paper, we propose a novel zero-shot hashing approach, called Discrete Similarity Transfer Network (SitNet), that preserves the semantic similarity between images from both “seen” concepts and new “unseen” concepts. Motivated by zero-shot learning, we adopt the semantic vectors of concepts to capture the similarity structure among classes, so that a model trained on seen concepts generalizes well to unseen ones. We adopt a multi-task architecture to exploit the supervised information for seen concepts and the semantic vectors simultaneously. Moreover, a discrete hashing layer is integrated into the network for hash code generation, which avoids the information loss caused by real-valued relaxation in the training phase, a critical problem in existing works. Experiments on three benchmarks validate the superiority of SitNet over state-of-the-art methods.
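As an illustration of what a discrete hashing layer can look like, the following minimal sketch binarizes real-valued activations with a sign function in the forward pass and lets gradients pass through it in the backward pass (a straight-through estimator). This is an assumption made for illustration: the abstract does not state how SitNet's layer is trained, and the function names `hash_forward`, `hash_backward`, and `hamming_distance` and the `clip` parameter are hypothetical.

```python
# Hypothetical sketch of a discrete hashing layer, not SitNet's confirmed design.
# Forward: real-valued activations -> binary codes in {-1, +1}.
# Backward: straight-through estimator (an assumed training trick).


def hash_forward(activations):
    """Binarize activations with a sign function; codes are exactly -1 or +1."""
    return [1 if a >= 0 else -1 for a in activations]


def hash_backward(activations, grad_codes, clip=1.0):
    """Pass the incoming gradient through unchanged where |activation| <= clip,
    and zero it elsewhere, since sign() itself has zero gradient almost everywhere."""
    return [g if abs(a) <= clip else 0.0 for a, g in zip(activations, grad_codes)]


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length codes differ."""
    return sum(1 for x, y in zip(code_a, code_b) if x != y)


if __name__ == "__main__":
    activations = [0.7, -1.8, 0.1, -0.2]
    codes = hash_forward(activations)
    grads = hash_backward(activations, [0.5, 0.5, -0.3, 0.2])
    print(codes, grads, hamming_distance(codes, [1, 1, 1, 1]))
```

Because the forward pass emits the actual binary codes, the loss is computed on the same codes used at retrieval time, which is the sense in which a discrete layer sidesteps the relaxation gap described above.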
