Indexing Large Visual Vocabulary by Randomized Dimensions Hashing for High Quantization Accuracy: Improving the Object Retrieval Quality

The bag-of-visual-words approach, inspired by text retrieval methods, has proven successful in achieving high performance in object retrieval on large-scale databases. A key step of these methods is the quantization stage which maps the high-dimensional image feature vectors to discriminatory visual words. In this paper, we consider the quantization step as the nearest neighbor search in large visual vocabulary, and thus proposed a randomized dimensions hashing (RDH) algorithm to efficiently index and search the large visual vocabulary. The experimental results have demonstrated that the proposed algorithm can effectively increase the quantization accuracy compared to the vocabulary tree based methods which represent the state-of-the-art. Consequently, the object retrieval performance can be significantly improved by our method in the large-scale database.

[1]  Richard Szeliski,et al.  City-Scale Location Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Bernt Schiele,et al.  Multiple Object Class Detection with a Generative Model , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[3]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[4]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.

[7]  Michael Isard,et al.  Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[8]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[9]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).