Deep Feature Aggregation and Image Re-Ranking With Heat Diffusion for Image Retrieval

Image retrieval based on deep convolutional features has demonstrated state-of-the-art performance on popular benchmarks. In this paper, we present a unified solution to deep convolutional feature aggregation and image re-ranking that simulates the dynamics of heat diffusion. A distinctive problem in image retrieval is that repetitive or bursty features tend to dominate the final image representation, making it less distinguishable. We show that by treating each deep feature as a heat source, our unsupervised aggregation method avoids over-representing bursty features. We additionally provide a practical solution for the proposed aggregation method and demonstrate its efficiency in experimental evaluation. Inspired by this aggregation method, we also propose a method to re-rank the top-ranked images for a given query by treating the query as the heat source. Finally, we extensively evaluate the proposed approach with pre-trained and fine-tuned deep networks on common public benchmarks and show superior performance compared to previous work.
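
To make the idea of down-weighting bursty features concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a cosine-similarity graph over the local features of one image and a single implicit-Euler step of graph heat diffusion, `(I + tL)^{-1}`. Each feature is weighted by the fraction of its own heat that remains at the source. The function names, the affinity choice, and the parameter `t` are all hypothetical.

```python
import numpy as np


def heat_diffusion_weights(features, t=1.0, eps=1e-12):
    """Weight each local feature by the heat it retains as a source.

    Assumption (not from the paper): heat spreads over a graph whose edge
    weights are non-negative cosine similarities, and diffusion is
    approximated by one implicit-Euler step, H = (I + t L)^{-1}.
    """
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    x = features / np.maximum(norms, eps)      # L2-normalise each feature
    affinity = np.clip(x @ x.T, 0.0, None)     # keep non-negative similarities
    np.fill_diagonal(affinity, 0.0)            # no self-loops
    laplacian = np.diag(affinity.sum(axis=1)) - affinity
    heat = np.linalg.inv(np.eye(len(x)) + t * laplacian)
    # H[i, i] is the heat left at source i; features with many near-duplicates
    # leak most of their heat to neighbours and so receive a small weight.
    return np.diag(heat).copy()


def aggregate(features, t=1.0):
    """Weighted sum pooling followed by L2 normalisation."""
    w = heat_diffusion_weights(features, t)
    pooled = (w[:, None] * features).sum(axis=0)
    return pooled / np.linalg.norm(pooled)


if __name__ == "__main__":
    # Five copies of one feature (a burst) plus one distinct feature.
    feats = np.array([[1.0, 0.0, 0.0]] * 5 + [[0.0, 1.0, 0.0]])
    print(heat_diffusion_weights(feats))
    print(aggregate(feats))
```

In this toy input the five repeated features each receive a smaller weight than the single distinct one, so the burst no longer dominates the pooled vector as it would under plain sum pooling. The dense matrix inverse is used only for brevity; it scales cubically with the number of features and would need a cheaper solver in practice.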
