D-Nets: Beyond patch-based image descriptors

Despite much research on patch-based descriptors, SIFT remains the gold standard for finding correspondences across images, and recent descriptors focus primarily on improving speed rather than accuracy. In this paper we propose Descriptor-Nets (D-Nets), a computationally efficient method that significantly improves the accuracy of image matching by going beyond patch-based approaches. D-Nets constructs a network whose nodes correspond to traditional sparsely or densely sampled keypoints, and samples image content along selected edges of this network. Our proposed representation is not only invariant to cropping, translation, scale, reflection and rotation, but also significantly more robust to severe perspective and non-linear distortions. We present several variants of our algorithm, including one that tunes itself to the image complexity and an efficient parallelized variant that employs a fixed grid. Comprehensive direct comparisons against SIFT and ORB on standard datasets demonstrate that D-Nets dominates existing approaches in terms of precision and recall while retaining computational efficiency.
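To make the idea of sampling image content along edges between keypoints concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a grayscale image stored as a list of rows, a fully connected net over the keypoints, and a simple min-max normalization followed by quantization. The names `sample_edge` and `build_net` and the parameters `n_samples` and `levels` are hypothetical and chosen only for illustration.

```python
import math
from itertools import permutations


def sample_edge(image, p, q, n_samples=8, levels=4):
    """Sample intensities along the directed segment p -> q and quantize them.

    Assumption: samples are taken at the midpoints of n_samples equal
    sections of the segment, then min-max normalized per edge.
    """
    (x0, y0), (x1, y1) = p, q
    values = []
    for i in range(n_samples):
        t = (i + 0.5) / n_samples
        # Nearest-pixel lookup (no interpolation, for simplicity).
        x = math.floor(x0 + t * (x1 - x0) + 0.5)
        y = math.floor(y0 + t * (y1 - y0) + 0.5)
        values.append(image[y][x])
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero on flat segments
    return tuple(min(levels - 1, int((v - lo) * levels / span)) for v in values)


def build_net(image, keypoints, **kw):
    """Describe every directed edge between distinct keypoints (full connectivity)."""
    return {
        (a, b): sample_edge(image, keypoints[a], keypoints[b], **kw)
        for a, b in permutations(range(len(keypoints)), 2)
    }


# Usage: a horizontal intensity ramp with two keypoints on the top row.
image = [[x for x in range(8)] for _ in range(8)]
net = build_net(image, [(0, 0), (7, 0)], n_samples=4, levels=4)
print(net[(0, 1)], net[(1, 0)])
```

In this sketch each edge yields a descriptor that depends only on the pixels between its two endpoints, so content outside the segment does not affect it. Which edges are selected and how the samples are encoded are design choices that this example fixes arbitrarily.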
