Paper Information - Fusing Feature Distribution Entropy with R-MAC Features in Image Retrieval

Image retrieval based on convolutional neural networks (CNNs) has attracted great attention among researchers because of its high performance. Pooling methods have become a research hotspot in image retrieval in recent years. In this paper, we propose the feature distribution entropy (FDE) to measure the difference in regional distribution information across the feature maps from CNNs. We propose a novel pooling method that fuses FDE with regional maximum activations of convolutions (R-MAC) features to improve retrieval performance by taking advantage of the regional distribution information in the feature maps. Compared with descriptors computed by R-MAC pooling, our method considers not only the most significant feature values of each region in the feature map but also the differences in distribution between regions. We use the histogram of feature values to calculate the regional distribution entropy and concatenate the regional distribution entropies into the FDE, which is then normalized and fused with the R-MAC feature vectors by weighted summation to generate the final feature descriptors. We have conducted experiments on public datasets, and the results demonstrate that our method can produce better retrieval performance than existing state-of-the-art algorithms. Higher performance can be achieved by further applying post-processing to the improved feature descriptors.
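The pipeline described in the abstract (histogram of feature values, regional entropy, normalization, weighted summation with the max-activation features) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the bin count, the choice of one entropy value per channel for a single region, the L2 normalization, and the fusion weight `alpha` are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


def histogram_entropy(values, bins=10):
    """Shannon entropy (in bits) of an equal-width histogram of `values`.

    The bin count is an assumed parameter, not taken from the paper.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # every value lands in one bin, so there is no uncertainty
    counts = [0] * bins
    for v in values:
        # Map v to a bin index; the maximum value goes into the last bin.
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def l2_normalize(vec):
    """Scale `vec` to unit Euclidean length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def region_descriptors(region):
    """Per-channel max activation and per-channel histogram entropy.

    `region` is a list of channels; each channel is a flat list of the
    activations that fall inside one spatial region (assumed layout).
    """
    mac = [max(channel) for channel in region]
    entropy = [histogram_entropy(channel) for channel in region]
    return mac, entropy


def fuse(mac, entropy, alpha=0.5):
    """Weighted sum of the two L2-normalized vectors, re-normalized.

    `alpha` is a hypothetical fusion weight.
    """
    if len(mac) != len(entropy):
        raise ValueError("vectors must have the same length")
    m, e = l2_normalize(mac), l2_normalize(entropy)
    return l2_normalize([a + alpha * b for a, b in zip(m, e)])


# Toy region with two channels of four activations each.
region = [[0.0, 0.2, 0.9, 0.1], [0.5, 0.5, 0.5, 0.5]]
mac, entropy = region_descriptors(region)
descriptor = fuse(mac, entropy, alpha=0.5)
```

In this toy region the second channel is constant, so its entropy is zero and only its maximum activation contributes to the fused descriptor. The first channel spreads its values over several bins and therefore receives an additional entropy term.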
