Efficient and Distinct Large Scale Bags of Words

As digital images proliferate and storage capacity grows, large-scale image databases have become common, and managing such a vast number of images is not trivial. This work addresses the problem of finding replicas in image databases containing more than 100 000 images. The bag of visual words method, analogous to the bag of words method in text search applications, is examined in detail: every image is represented as a set of visual words. Computing these visual words requires clustering a huge amount of data quickly yet accurately, which is the most time-consuming part of such an application. We develop a clustering algorithm that has linear runtime and can be run in parallel. We observe that as the database grows, discrimination between high-frequency images decreases: features of images with natural repetitive texture become similar to those of other images, so these images show up in most search results. We address this problem with an asymmetric Hamming distance measure for bags of visual words. It provides better discriminative power in large databases while remaining robust to image transformations such as rotation, cropping, and changes of resolution and size.
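As a rough illustration of the two ideas named above, the following minimal sketch quantizes local descriptors to their nearest cluster centre (a visual word) and compares two bags with an asymmetric, Hamming-style distance. The abstract does not define the distance, so the function names, the penalty weights `miss_weight` and `extra_weight`, and the toy vocabulary are assumptions made for illustration only and should not be read as the authors' formulation.

```python
from typing import Sequence, Set

Vector = Sequence[float]


def nearest_word(descriptor: Vector, vocabulary: Sequence[Vector]) -> int:
    """Return the index of the closest centroid (squared Euclidean distance)."""
    best_index, best_dist = 0, float("inf")
    for index, centroid in enumerate(vocabulary):
        dist = sum((d - c) ** 2 for d, c in zip(descriptor, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bag_of_words(descriptors: Sequence[Vector], vocabulary: Sequence[Vector]) -> Set[int]:
    """Represent an image as the set of visual words its descriptors map to."""
    return {nearest_word(d, vocabulary) for d in descriptors}


def asymmetric_hamming(query: Set[int], candidate: Set[int],
                       miss_weight: float = 1.0,
                       extra_weight: float = 0.25) -> float:
    """Hypothetical asymmetric distance between two binary bags.

    Words present in the query but absent from the candidate are penalised
    more heavily than words the candidate has in addition. The weights are
    illustrative assumptions, not values from the source.
    """
    missing = len(query - candidate)   # query words the candidate lacks
    extra = len(candidate - query)     # candidate words the query lacks
    return miss_weight * missing + extra_weight * extra


# Toy 2-D vocabulary of four visual words (real descriptors are much longer).
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
query_bag = bag_of_words([(1.0, 1.0), (9.0, 1.0)], vocabulary)
candidate_bag = bag_of_words([(0.0, 1.0), (9.0, 0.0), (1.0, 9.0)], vocabulary)

forward = asymmetric_hamming(query_bag, candidate_bag)
backward = asymmetric_hamming(candidate_bag, query_bag)
```

With these toy inputs the two directions give different values, which is the property that distinguishes an asymmetric measure from the ordinary Hamming distance. Under the assumed weights, an image that contains many extra words, such as one with repetitive texture, is penalised only lightly when it is the candidate but heavily when it is the query.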
