Scalability of local image descriptors: a comparative study

Computer vision researchers have recently proposed several local descriptor schemes. Due to a lack of database support, however, these descriptors have only been evaluated using small image collections. Recently, we have developed the PvS-framework, which allows efficient querying of large local descriptor collections. In this paper, we use the PvS-framework to study the scalability of local image descriptors. We propose a new local descriptor scheme and compare it to three other well-known schemes. Using a collection of almost thirty thousand images, we show that the new scheme gives the best results in almost all cases. We then give two stop rules to reduce query processing time and show that in many cases only a few query descriptors must be processed to find matching images. Finally, we test our descriptors on a collection of over three hundred thousand images, comprising over 200 million local descriptors, and show that even at such a large scale the results are still of high quality, with no change in query processing time.
