Paper Information - Scalar Quantization-Based Text Encoding for Large Scale Image Retrieval

Scalar Quantization-Based Text Encoding for Large Scale Image Retrieval

The great success of visual features learned from deep neural networks has led to a significant effort to develop efficient and scalable technologies for image retrieval. This paper presents an approach for transforming neural network features into text codes that can be indexed by a standard full-text retrieval engine such as Elasticsearch. The basic idea is to transform the neural network features with the twofold aim of promoting sparsity and avoiding the need for unsupervised pre-training. We validate our approach on a recent convolutional neural network feature, namely Regional Maximum Activations of Convolutions (R-MAC), which is a state-of-the-art descriptor for image retrieval. An extensive experimental evaluation conducted on standard benchmarks shows the effectiveness and efficiency of the proposed approach and how it compares to state-of-the-art main-memory indexes.
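
The abstract does not spell out the encoding itself, so the following is only a minimal sketch of how a real-valued feature vector might be turned into a text code by scalar quantization. Everything in it is an assumption made for illustration: the function names `crelu` and `encode_as_text`, the token naming scheme `f<i>`, the `scale` and `threshold` parameters, and the use of a concatenated ReLU step to make all components non-negative. Each component is scaled and floored to an integer, and the corresponding token is repeated that many times, so that a full-text engine's term frequencies approximate the original feature values. Components at or below the threshold produce no tokens, which keeps the resulting text sparse.

```python
import math


def crelu(vec):
    """Concatenate the positive parts of v and -v (assumed preprocessing step).

    The result has twice the dimensionality and only non-negative entries.
    """
    return [max(x, 0.0) for x in vec] + [max(-x, 0.0) for x in vec]


def encode_as_text(vec, scale=30, threshold=0.0):
    """Hypothetical scalar-quantization text encoder.

    Each non-negative component x at position i becomes the token "f<i>"
    repeated floor(scale * x) times. Components <= threshold are dropped.
    """
    tokens = []
    for i, x in enumerate(crelu(vec)):
        if x <= threshold:
            continue  # dropped components keep the text code sparse
        count = math.floor(scale * x)
        tokens.extend([f"f{i}"] * count)
    return " ".join(tokens)


if __name__ == "__main__":
    # Toy 3-dimensional feature; real descriptors have far more dimensions.
    print(encode_as_text([0.5, -0.25, 0.0], scale=4))  # prints "f0 f0 f4"
```

In this sketch a larger `scale` preserves more precision at the cost of longer text, while a higher `threshold` trades accuracy for sparser codes. The resulting string could then be indexed as an ordinary text field by a full-text engine.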
