Feature Pyramid Hashing

In recent years, deep-networks-based hashing has become a leading approach for large-scale image retrieval. Most deep hashing approaches use the high layer to extract the powerful semantic representations. However, these methods have limited ability for fine-grained image retrieval because the semantic features extracted from the high layer are difficult in capturing the subtle differences. To this end, we propose a novel two-pyramid hashing architecture to learn both the semantic information and the subtle appearance details for fine-grained image search. Inspired by the feature pyramids of convolutional neural network, avertical pyramid is proposed to capture the high-layer features and ahorizontal pyramid combines multiple low-layer features with structural information to capture the subtle differences. To fuse the low-level features, a novel combination strategy, called consensus fusion, is proposed to capture all subtle information from several low-layers for finer retrieval. Extensive evaluation on two fine-grained datasets CUB-200-2011 and Stanford Dogs demonstrate that the proposed method achieves significant performance compared with the state-of-art baselines.

[1]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[2]  Nicu Sebe,et al.  A Survey on Learning to Hash , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Sheng Jin Deep Saliency Hashing , 2018, ArXiv.

[5]  Hongxun Yao,et al.  Hierarchical semantic image matching using CNN feature pyramid , 2018, Comput. Vis. Image Underst..

[6]  Svetlana Lazebnik,et al.  Iterative quantization: A procrustean approach to learning binary codes , 2011, CVPR 2011.

[7]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[8]  Rongrong Ji,et al.  Supervised hashing with kernels , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Geoffrey E. Hinton,et al.  Learning a Nonlinear Embedding by Preserving Class Neighbourhood Structure , 2007, AISTATS.

[10]  Jianmin Wang,et al.  Deep Cauchy Hashing for Hamming Space Retrieval , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[12]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Antonio Torralba,et al.  Spectral Hashing , 2008, NIPS.

[14]  Shih-Fu Chang,et al.  Sequential Projection Learning for Hashing with Compact Codes , 2010, ICML.

[15]  Tao Mei,et al.  Deep Semantic Hashing with Generative Adversarial Networks , 2017, SIGIR.

[16]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[17]  Hanjiang Lai,et al.  Supervised Hashing for Image Retrieval via Image Representation Learning , 2014, AAAI.

[18]  Gang Yu,et al.  Cascaded Pyramid Network for Multi-person Pose Estimation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Heng Tao Shen,et al.  Unsupervised Deep Hashing with Similarity-Adaptive and Discrete Optimization , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  David Suter,et al.  A General Two-Step Approach to Learning-Based Hashing , 2013, 2013 IEEE International Conference on Computer Vision.

[22]  Chun Chen,et al.  Semi-Supervised Nonlinear Hashing Using Bootstrap Sequential Projection Learning , 2013, IEEE Transactions on Knowledge and Data Engineering.

[23]  Fuchun Sun,et al.  Deep Feature Pyramid Reconfiguration for Object Detection , 2018, ECCV.

[24]  Shiguang Shan,et al.  Deep Supervised Hashing for Fast Image Retrieval , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Xi Zhang,et al.  Attention-Aware Deep Adversarial Hashing for Cross-Modal Retrieval , 2017, ECCV.

[26]  Wei Liu,et al.  Hashing with Graphs , 2011, ICML.

[27]  Wu-Jun Li,et al.  Feature Learning Based Deep Supervised Hashing with Pairwise Labels , 2015, IJCAI.

[28]  Ian D. Reid,et al.  Fast Training of Triplet-Based Deep Binary Embedding Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[30]  Jianping Fan,et al.  Spatial pyramid deep hashing for large-scale image retrieval , 2017, Neurocomputing.

[31]  Qiaosong Wang,et al.  Adversarial Learning for Fine-Grained Image Search , 2018, 2019 IEEE International Conference on Multimedia and Expo (ICME).

[32]  Luo Si,et al.  Ranking Preserving Hashing for Fast Similarity Search , 2015, IJCAI.

[33]  Trevor Darrell,et al.  Learning to Hash with Binary Reconstructive Embeddings , 2009, NIPS.

[34]  David J. Fleet,et al.  Minimal Loss Hashing for Compact Binary Codes , 2011, ICML.

[35]  Shih-Fu Chang,et al.  Semi-supervised hashing for scalable image retrieval , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[36]  Hanjiang Lai,et al.  Simultaneous feature learning and hash coding with deep neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Philip S. Yu,et al.  HashNet: Deep Learning to Hash by Continuation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).