An efficient key point quantization algorithm for large scale image retrieval

We address the problem of large-scale near-duplicate image retrieval. Recent studies have shown that local image features, often referred to as key points, are effective for this task. The most popular approach to key-point-based image matching is the clustering-based bag-of-words model, which maps each key point to a visual word in a codebook constructed by a clustering algorithm and represents each image as a histogram of visual words. Despite its success, this model has two main shortcomings: (i) clustering millions of key points into thousands of visual words is computationally expensive; (ii) there is no theoretical analysis of its performance. We propose a new key point quantization scheme that addresses both shortcomings. Instead of clustering, the proposed scheme quantizes each key point into a binary vector using a collection of randomly generated hyperspheres, and a bag-of-words model is then built on this randomized quantization. Our theoretical analysis shows that the resulting image similarity provides an upper bound on the similarity based on the optimal partial matching between two sets of key points. An empirical study on a database of 100,000 images shows that the proposed scheme is both more efficient and more effective than the clustering-based approach for near-duplicate image retrieval.
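
To make the quantization idea concrete, here is a minimal sketch of one plausible reading of the scheme, not the authors' implementation. It assumes that bit i of a key point's binary vector is 1 when the point lies inside hypersphere i, that each hypersphere acts as one visual word, and that two images are compared by histogram intersection. The function names, the uniform sampling of centers in the unit cube, the shared radius, and the choice of similarity are all assumptions made for illustration.

```python
import random


def make_spheres(num_spheres, dim, radius, rng):
    """Draw random hypersphere centers in [0, 1]^dim with a shared radius.

    Uniform centers and a single radius are illustrative assumptions.
    """
    return [([rng.random() for _ in range(dim)], radius)
            for _ in range(num_spheres)]


def quantize(point, spheres):
    """Return a binary tuple: bit i is 1 if `point` lies inside sphere i."""
    bits = []
    for center, radius in spheres:
        dist_sq = sum((p - c) ** 2 for p, c in zip(point, center))
        bits.append(1 if dist_sq <= radius ** 2 else 0)
    return tuple(bits)


def bag_of_words(key_points, spheres):
    """Histogram over spheres: entry i counts the key points inside sphere i.

    Treating each sphere as one visual word is an assumption of this sketch.
    """
    hist = [0] * len(spheres)
    for point in key_points:
        for i, bit in enumerate(quantize(point, spheres)):
            hist[i] += bit
    return hist


def histogram_intersection(hist_a, hist_b):
    """Similarity of two images as the sum of bin-wise minima (assumed)."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


if __name__ == "__main__":
    rng = random.Random(0)
    spheres = make_spheres(num_spheres=64, dim=8, radius=0.9, rng=rng)
    image_a = [[rng.random() for _ in range(8)] for _ in range(50)]
    image_b = [[rng.random() for _ in range(8)] for _ in range(50)]
    sim = histogram_intersection(bag_of_words(image_a, spheres),
                                 bag_of_words(image_b, spheres))
    print("similarity:", sim)
```

Note that no clustering step appears anywhere: building the "codebook" costs only the time needed to draw the sphere centers, and quantizing a key point is a single pass over the spheres.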
