Learning a Fine Vocabulary

A novel similarity measure for bag-of-words type large scale image retrieval is presented. The similarity function is learned in an unsupervised manner, requires no extra space over the standard bag-of-words method and is more discriminative than both L2-based soft assignment and Hamming embedding. We show experimentally that the novel similarity function achieves mean average precision that is superior to any result published in the literature on a number of standard datasets. At the same time, retrieval with the proposed similarity function is faster than the reference method.

[1]  Gordon F. Royle,et al.  Algebraic Graph Theory , 2001, Graduate texts in mathematics.

[2]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[3]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[4]  Luc Van Gool,et al.  Edinburgh Research Explorer Simultaneous Object Recognition and Segmentation by Image Exploration , 2022 .

[5]  Antonio Torralba,et al.  Building the gist of a scene: the role of global image features in recognition. , 2006, Progress in brain research.

[6]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Gang Hua,et al.  Discriminant Embedding for Local Image Descriptors , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[8]  Jiri Matas,et al.  Improving Descriptors for Fast Tree Matching by Optimal Linear Projection , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[9]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[10]  Michael Isard,et al.  Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[11]  Andrew J. Davison,et al.  Active Matching , 2008, ECCV.

[12]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[13]  Michal Perdoch,et al.  Efficient sequential correspondence selection by cosegmentation , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[16]  C. Schmid,et al.  On the burstiness of visual elements , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Richard Szeliski,et al.  Building Rome in a day , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[18]  Jiri Matas,et al.  Efficient representation of local geometry for large scale object retrieval , 2009, CVPR.

[19]  Cordelia Schmid,et al.  Improving Bag-of-Features for Large Scale Image Search , 2010, International Journal of Computer Vision.

[20]  Jiri Matas,et al.  Large-Scale Discovery of Spatially Related Images , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Jan-Michael Frahm,et al.  Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs , 2008, International Journal of Computer Vision.