Efficient Feature Detection and Effective Post-Verification for Large Scale Near-Duplicate Image Search

State-of-the-art near-duplicate image search systems mostly build on the bag-of-local-features (BOF) representation. While favored for their simplicity and scalability, these systems have three shortcomings: 1) the high time complexity of local feature detection; 2) the reduced discriminability of local descriptors caused by BOF quantization; and 3) the neglect of geometric relationships among local features once they are reduced to a BOF representation. To overcome these shortcomings, we propose a novel framework that exploits graphics processing units (GPUs). The main contributions of our method are: 1) a new fast local feature detector, coined Harris-Hessian (H-H), designed around the characteristics of the GPU to accelerate local feature detection; 2) the incorporation of the spatial information around each local feature to improve its discriminability, which provides semi-local spatial coherent verification (LSC); and 3) a new pairwise weak geometric consistency constraint (P-WGC) algorithm that refines the search result. Additionally, part of the system is implemented on the GPU to improve efficiency. Experiments conducted on reference datasets and on a dataset of one million images demonstrate the effectiveness and efficiency of H-H, LSC, and P-WGC.
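
Shortcomings 2) and 3) can be illustrated with a minimal sketch of BOF quantization. This is not the paper's implementation: the two-word vocabulary, the 2-D descriptors, and the names `nearest_word` and `bof_histogram` are assumptions chosen only to keep the example small.

```python
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to `descriptor` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(vocabulary)), key=lambda i: sq_dist(descriptor, vocabulary[i]))


def bof_histogram(features, vocabulary):
    """Build a BOF histogram from (x, y, descriptor) triples.

    The keypoint position (x, y) is discarded here, which is exactly why
    geometric relationships are lost after BOF representation.
    """
    return Counter(nearest_word(desc, vocabulary) for _x, _y, desc in features)


# Hypothetical toy vocabulary with two visual words in a 2-D descriptor space.
vocabulary = [(0.0, 0.0), (1.0, 1.0)]

# Two images whose descriptors differ and whose keypoint layouts are swapped.
image_a = [(10, 10, (0.1, 0.0)), (200, 50, (0.9, 1.0))]
image_b = [(200, 50, (0.4, 0.4)), (10, 10, (1.2, 0.8))]

hist_a = bof_histogram(image_a, vocabulary)
hist_b = bof_histogram(image_b, vocabulary)
print(hist_a == hist_b)
```

In this toy case both images receive the same histogram even though their descriptors are noticeably different (quantization loss) and their keypoints sit in different places (geometry loss). Post-verification steps such as those the abstract describes are aimed at recovering this kind of discarded information.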
