Similarity-preserving hashing based on deep neural networks for large-scale image retrieval

Abstract: Similarity-preserving hashing has become a mainstream approach to approximate nearest neighbor (ANN) search for large-scale image retrieval. Recent research shows that deep neural networks can produce effective feature representations. Most existing deep hashing schemes simply use the middle-layer features of a deep neural network to measure the similarity between query images and database images. However, these visual features are suboptimal for discriminating the semantic information of images, especially complex images that contain multiple objects. In this paper, a deep framework is employed to learn multi-level non-linear transformations that yield high-level image features, and these intermediate features are then combined with top-layer visual information to perform image retrieval. Three criteria are enforced on the resulting compact codes: (1) minimal quantization loss; (2) evenly distributed bits; (3) independent bits. Experimental results on five public large-scale datasets demonstrate the superiority of our method over several other state-of-the-art methods.
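The three criteria can be made concrete with a small numerical sketch. The abstract does not give the exact loss terms, so the formulation below is an assumption based on penalties commonly used in the deep hashing literature, not the authors' objective. The function name `hashing_penalties` and the input `h` (real-valued network outputs before binarization) are hypothetical.

```python
import numpy as np

def hashing_penalties(h):
    """Illustrative penalties for the three code criteria (assumed forms).

    h: array of shape (n_samples, n_bits), real-valued outputs of the
       hashing layer, expected to lie roughly in [-1, 1].
    Returns (quantization, balance, independence); each is 0 in the ideal case.
    """
    n, k = h.shape
    # Binarize with the sign function, mapping zero to +1.
    b = np.where(h >= 0, 1.0, -1.0)

    # (1) Quantization loss: how far the real outputs are from their binary codes.
    quantization = np.mean((b - h) ** 2)

    # (2) Evenly distributed bits: each bit should be +1 for about half the
    #     samples, so every column mean of the codes should be close to 0.
    balance = np.mean(np.mean(b, axis=0) ** 2)

    # (3) Independent bits: the bit-by-bit correlation matrix of the outputs
    #     should be close to the identity.
    corr = h.T @ h / n
    independence = np.sum((corr - np.eye(k)) ** 2) / (k * k)

    return quantization, balance, independence

# Four samples with two bits that are already binary, balanced, and uncorrelated.
ideal = np.array([[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]])
print(hashing_penalties(ideal))
```

In a training setting these terms would typically be weighted and added to a similarity-preserving loss; the weights and that loss are not specified in the abstract.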
