Ratio Voting: A New Voting Strategy for Large-Scale Image Retrieval

We propose a new voting strategy, referred to as ratio voting, to improve bag-of-visual-words-based image retrieval. Ratio voting limits the number of votes in proportion to the number of features in each visual word, whereas conventional schemes use (estimated) distances or rank information as the filtering criterion. It thereby realizes adaptive thresholding that captures the density of feature vectors. In experiments, we adopt two different distance estimation methods in the post-filtering step and show that, despite its simplicity, ratio voting achieves a considerable improvement in both cases. Furthermore, we perform exhaustive experiments combining ratio voting with multiple assignment approaches and show that the choice of multiple assignment approach also has a remarkable impact on accuracy.
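The core idea of capping votes in proportion to the size of a visual word can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `ratio_vote` and `score_images`, the `ratio` parameter and its default, the ceiling rounding, the floor of one vote per query feature, and the unit vote weight are all assumptions made for illustration. Each query feature is assumed to arrive with a list of `(image_id, estimated_distance)` pairs for the database features that share its visual word.

```python
import math
from collections import Counter


def ratio_vote(candidates, ratio=0.1):
    """Return the image ids that receive a vote from one query feature.

    candidates: list of (image_id, estimated_distance) pairs for the
    database features in the same visual word as the query feature.
    ratio: hypothetical fraction of the word's features allowed to vote.
    """
    if not candidates:
        return []
    # The vote budget grows with the number of features in the word, so
    # densely populated words admit more votes than sparse ones.
    # Rounding up and allowing at least one vote are assumptions.
    budget = max(1, math.ceil(ratio * len(candidates)))
    # Keep the candidates with the smallest estimated distances.
    ranked = sorted(candidates, key=lambda pair: pair[1])
    return [image_id for image_id, _ in ranked[:budget]]


def score_images(per_feature_candidates, ratio=0.1):
    """Accumulate unit votes per image over all query features."""
    votes = Counter()
    for candidates in per_feature_candidates:
        votes.update(ratio_vote(candidates, ratio))
    return votes
```

In contrast to a fixed distance threshold or a fixed top-k rank cutoff, the budget here depends only on how many features fall into the visual word, which is how this sketch reflects the adaptive thresholding described above.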
