Deep Progressive Hashing for Image Retrieval

This paper proposes a novel recursive hashing scheme, in contrast to conventional "one-off" based hashing algorithms. Inspired by human's "nonsalient-to-salient" perception path, the proposed hashing scheme generates a series of binary codes based on progressively expanded salient regions. Built on a recurrent deep network, i.e., LSTM structure, the binary codes generated from later output nodes naturally inherit information aggregated from previously codes while explore novel information from the extended salient region, and therefore it possesses good scalability property. The proposed deep hashing network is trained via minimizing a triplet ranking loss, which is end-to-end trainable. Extensive experimental results on several image retrieval benchmarks demonstrate good performance gain over state-of-the-art image retrieval methods and its scalability property.

[1]  David J. Fleet,et al.  Hamming Distance Metric Learning , 2012, NIPS.

[2]  Shih-Fu Chang,et al.  Hash Bit Selection: A Unified Solution for Selection Problems in Hashing , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[4]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[5]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[6]  Chu-Song Chen,et al.  Supervised Learning of Semantics-Preserving Hashing via Deep Neural Networks for Large-Scale Image Search , 2015, ArXiv.

[7]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[8]  Samy Bengio,et al.  Show and tell: A neural image caption generator , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[10]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[11]  Yizhou Wang,et al.  Neighborhood-Preserving Hashing for Large-Scale Cross-Modal Search , 2016, ACM Multimedia.

[12]  Kristen Grauman,et al.  Kernelized Locality-Sensitive Hashing , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Jingdong Wang,et al.  Binary Optimized Hashing , 2016, ACM Multimedia.

[14]  Tat-Seng Chua,et al.  NUS-WIDE: a real-world web image database from National University of Singapore , 2009, CIVR '09.

[15]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Jiwen Lu,et al.  Deep hashing for compact binary codes learning , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Fei-Fei Li,et al.  Deep visual-semantic alignments for generating image descriptions , 2015, CVPR.

[18]  David Suter,et al.  Fast Supervised Hashing with Decision Trees for High-Dimensional Data , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[20]  Wojciech Zaremba,et al.  Learning to Execute , 2014, ArXiv.

[21]  Kristen Grauman,et al.  Kernelized locality-sensitive hashing for scalable image search , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[22]  Sanjiv Kumar,et al.  Learning Binary Codes for High-Dimensional Data Using Bilinear Projections , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Junchi Yan,et al.  Visual Saliency Detection via Sparsity Pursuit , 2010, IEEE Signal Processing Letters.

[24]  James M. Rehg,et al.  The Secrets of Salient Object Segmentation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[25]  Dan Zhang,et al.  Learning to Hash with Partial Tags: Exploring Correlation between Tags and Hashing Bits for Large Scale Image Retrieval , 2014, ECCV.

[26]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[27]  Philip S. Yu,et al.  Deep Visual-Semantic Hashing for Cross-Modal Retrieval , 2016, KDD.

[28]  Tieniu Tan,et al.  Deep semantic ranking based hashing for multi-label image retrieval , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Guosheng Lin,et al.  Learning Hash Functions Using Column Generation , 2013, ICML.

[30]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[31]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Geoffrey E. Hinton,et al.  Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.

[33]  John K. Tsotsos,et al.  Saliency Based on Information Maximization , 2005, NIPS.

[34]  Chao Ma,et al.  Supervised Recurrent Hashing for Large Scale Video Retrieval , 2016, ACM Multimedia.

[35]  Jiwen Lu,et al.  Supervised Discriminative Hashing for Compact Binary Codes , 2014, ACM Multimedia.

[36]  Pietro Perona,et al.  Graph-Based Visual Saliency , 2006, NIPS.

[37]  Christof Koch,et al.  A Model of Saliency-Based Visual Attention for Rapid Scene Analysis , 2009 .

[38]  Wei Liu,et al.  Learning Hash Codes with Listwise Supervision , 2013, 2013 IEEE International Conference on Computer Vision.