Paper information - Bundling features for large scale partial-duplicate web image search

In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schemes inspired by text retrieval are then applied for large scale image indexing and retrieval. Bag-of-words representations, however: 1) reduce the discriminative power of image features due to feature quantization; and 2) ignore geometric relationships among visual words. Exploiting such geometric constraints, by estimating a 2D affine transformation between a query image and each candidate image, has been shown to greatly improve retrieval precision but at high computational cost. In this paper we present a novel scheme where image features are bundled into local groups. Each group of bundled features becomes much more discriminative than a single feature, and within each group simple and robust geometric constraints can be efficiently enforced. Experiments in Web image search, with a database of more than one million images, show that our scheme achieves a 49% improvement in average precision over the baseline bag-of-words approach. Retrieval performance is comparable to existing full geometric verification approaches while being much less computationally expensive. When combined with full geometric verification we achieve a 77% precision improvement over the baseline bag-of-words approach, and a 24% improvement over full geometric verification alone.
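The bundling idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how groups are formed or which geometric constraint is enforced, so the rectangular grouping regions, the left-to-right order check, and the names `bundle_features` and `bundle_score` are all assumptions made for illustration. Features are assumed to be already quantized into integer visual-word IDs.

```python
def bundle_features(features, regions):
    """Group quantized features into local bundles.

    features: list of (visual_word, x, y) tuples.
    regions:  list of (x_min, y_min, x_max, y_max) boxes. Using boxes as the
              grouping rule is an assumption of this sketch.
    Returns one bundle per region, each sorted by x coordinate.
    """
    bundles = []
    for x0, y0, x1, y1 in regions:
        members = [f for f in features if x0 <= f[1] <= x1 and y0 <= f[2] <= y1]
        bundles.append(sorted(members, key=lambda f: f[1]))
    return bundles


def bundle_score(query_bundle, candidate_bundle):
    """Score two bundles: shared visual words minus order violations.

    The order check (hypothetical constraint) counts adjacent shared words
    whose left-to-right order in the query is reversed in the candidate.
    """
    # Rank of each visual word in the candidate bundle (first occurrence wins).
    cand_rank = {}
    for rank, (word, _x, _y) in enumerate(candidate_bundle):
        cand_rank.setdefault(word, rank)

    # Candidate ranks of the shared words, listed in query order.
    ranks = [cand_rank[w] for w, _x, _y in query_bundle if w in cand_rank]
    shared = len(ranks)
    inversions = sum(1 for a, b in zip(ranks, ranks[1:]) if a > b)
    return shared - inversions


# Toy data: three features fall inside the single region, one lies outside it.
query_features = [(7, 1.0, 1.0), (3, 2.0, 1.5), (9, 3.0, 1.2), (4, 10.0, 10.0)]
query_bundle = bundle_features(query_features, [(0, 0, 5, 5)])[0]

same_order = [(7, 11.0, 4.0), (3, 12.0, 4.0), (9, 13.0, 4.0)]
flipped = [(9, 11.0, 4.0), (3, 12.0, 4.0), (7, 13.0, 4.0)]
unrelated = [(1, 11.0, 4.0), (2, 12.0, 4.0)]

print(bundle_score(query_bundle, same_order))
print(bundle_score(query_bundle, flipped))
print(bundle_score(query_bundle, unrelated))
```

A bundle that shares the same words in the same spatial order scores higher than one that shares the words in a scrambled order, which is the sense in which a group can be more discriminative than a single visual word. Because the check only compares features within one bundle, it needs no global transformation estimate between the two images.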

Jian Sun | M. Isard | Qifa Ke | Zhong Wu
