$\mathbb {H}$H-Patches: A Benchmark and Evaluation of Handcrafted and Learned Loca

In this paper, a novel benchmark is introduced for evaluating local image descriptors. We demonstrate limitations of the commonly used datasets and evaluation protocols, that lead to ambiguities and contradictory results in the literature. Furthermore, these benchmarks are nearly saturated due to the recent improvements in local descriptors obtained by learning from large annotated datasets. To address these issues, we introduce a new large dataset suitable for training and testing modern descriptors, together with strictly defined evaluation protocols in several tasks such as matching, retrieval and verification. This allows for more realistic, thus more reliable comparisons in different application scenarios. We evaluate the performance of several state-of-the-art descriptors and analyse their properties. We show that a simple normalisation of traditional hand-crafted descriptors is able to boost their performance to the level of deep learning based descriptors once realistic benchmarks are considered. Additionally we specify a protocol for learning and evaluating using cross validation. We show that when training state-of-the-art descriptors on this dataset, the traditional verification task is almost entirely saturated.

[1]  Tomasz Malisiewicz,et al.  SuperPoint: Self-Supervised Interest Point Detection and Description , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Jiri Matas,et al.  Repeatability Is Not Enough: Learning Affine Regions via Discriminability , 2017, ECCV.

[3]  Jiri Matas,et al.  Learning Discriminative Affine Regions via Discriminability , 2017, ArXiv.

[4]  Bin Fan,et al.  L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Ondrej Chum,et al.  Multiple-Kernel Local-Patch Descriptor , 2017, BMVC.

[6]  Shih-Fu Chang,et al.  Learning Discriminative and Transformation Covariant Local Feature Detectors , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Jiri Matas,et al.  Working hard to know your neighbor's margins: Local descriptor learning loss , 2017, NIPS.

[8]  Andrea Vedaldi,et al.  HPatches: A Benchmark and Evaluation of Handcrafted and Learned Local Descriptors , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Vassileios Balntas,et al.  Efficient learning of local image descriptors , 2016 .

[10]  Jiwen Lu,et al.  Learning Compact Binary Descriptors with Unsupervised Deep Neural Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Yung-Yu Chuang,et al.  Accumulated Stability Voting: A Robust Descriptor from Descriptors of Multiple Scales , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Andrea Vedaldi,et al.  Learning Covariant Feature Detectors , 2016, ECCV Workshops.

[13]  I. Pandzic,et al.  Learning local descriptors by optimizing the keypoint-correspondence criterion , 2016, 2016 23rd International Conference on Pattern Recognition (ICPR).

[14]  Vincent Lepetit,et al.  LIFT: Learned Invariant Feature Transform , 2016, ECCV.

[15]  Cordelia Schmid,et al.  Convolutional Patch Representations for Image Retrieval: An Unsupervised Approach , 2016, International Journal of Computer Vision.

[16]  F. Perronnin,et al.  Local Convolutional Features with Unsupervised Training for Image Retrieval , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[17]  Iasonas Kokkinos,et al.  Discriminative Learning of Deep Convolutional Feature Point Descriptors , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[18]  Vincent Lepetit,et al.  Learning to Assign Orientations to Feature Points , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Hervé Jégou,et al.  Kernel Local Descriptors with Implicit Rotation Matching , 2015, ICMR.

[20]  Krystian Mikolajczyk,et al.  BOLD - Binary online learned descriptor for efficient image matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Rahul Sukthankar,et al.  MatchNet: Unifying feature and metric learning for patch-based matching , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Nikos Komodakis,et al.  Learning to compare image patches via convolutional neural networks , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Vincent Lepetit,et al.  Learning Image Descriptors with Boosting , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Tal Hassner,et al.  LATCH: Learned arrangements of three patch codes , 2015, 2016 IEEE Winter Conference on Applications of Computer Vision (WACV).

[25]  Stefano Soatto,et al.  Domain-size pooling in local descriptors: DSP-SIFT , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Vincent Lepetit,et al.  TILDE: A Temporally Invariant Learned DEtector , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Andrew Zisserman,et al.  Learning Local Feature Descriptors Using Convex Optimisation , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[28]  Thomas Brox,et al.  Discriminative Unsupervised Feature Learning with Convolutional Neural Networks , 2014, NIPS.

[29]  Konrad Schindler,et al.  Predicting Matchability , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Shiliang Zhang,et al.  USB: Ultrashort Binary Descriptor for Fast Visual Matching and Retrieval , 2014, IEEE Transactions on Image Processing.

[31]  Thomas Brox,et al.  Descriptor Matching with Convolutional Neural Networks: a Comparison to SIFT , 2014, ArXiv.

[32]  Lu Tian,et al.  OSRI: A Rotationally Invariant Binary Descriptor , 2014, IEEE Transactions on Image Processing.

[33]  Rudy Lauwereins,et al.  Derivative-Based Scale Invariant Image Feature Detector With Error Resilience , 2014, IEEE Transactions on Image Processing.

[34]  Bernd Girod,et al.  Feature Matching Performance of Compact Descriptors for Visual Search , 2014, 2014 Data Compression Conference.

[35]  Svetlana Lazebnik,et al.  Multi-scale Orderless Pooling of Deep Convolutional Activation Features , 2014, ECCV.

[36]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[37]  Rudy Lauwereins,et al.  SIFER: Scale-Invariant Feature Detector with Error Resilience , 2013, International Journal of Computer Vision.

[38]  Luigi di Stefano,et al.  Keypoints from symmetries by wave propagation , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[39]  Edwin Olson,et al.  Learning convolutional filters for interest point detection , 2013, 2013 IEEE International Conference on Robotics and Automation.

[40]  Christian Osendorfer,et al.  Convolutional Neural Networks Learn Compact Local Image Descriptors , 2013, ICONIP.

[41]  David J. Kriegman,et al.  Locally Uniform Comparison Image Descriptor , 2012, NIPS.

[42]  Krystian Mikolajczyk,et al.  Evaluation of local detectors and descriptors for fast feature matching , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[43]  Vincent Lepetit,et al.  Efficient Discriminative Projections for Compact Binary Descriptors , 2012, ECCV.

[44]  Adrien Bartoli,et al.  KAZE Features , 2012, ECCV.

[45]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  R. Kouskouridas,et al.  Improving the robustness in feature detection by local contrast enhancement , 2012, 2012 IEEE International Conference on Imaging Systems and Techniques Proceedings.

[47]  Andrew Zisserman,et al.  Three things everyone should know to improve object retrieval , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[48]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[49]  K. S. Pedersen,et al.  Interesting Interest Points , 2012, International Journal of Computer Vision.

[50]  Bin Fan,et al.  Local Intensity Order Pattern for feature description , 2011, 2011 International Conference on Computer Vision.

[51]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[52]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[53]  C. Lawrence Zitnick,et al.  Edge foci interest points , 2011, 2011 International Conference on Computer Vision.

[54]  Tom Drummond,et al.  Binary Histogrammed Intensity Patches for Efficient and Robust Matching , 2011, International Journal of Computer Vision.

[55]  Zhanyi Hu,et al.  Aggregating gradient distributions into intensity orders: A novel local image descriptor , 2011, CVPR 2011.

[56]  Bodo Rosenhahn,et al.  Increasing the accuracy of feature evaluation benchmarks using differential evolution , 2011, 2011 IEEE Symposium on Differential Evolution (SDE).

[57]  Jean-Michel Morel,et al.  ASIFT: An Algorithm for Fully Affine Invariant Comparison , 2011, Image Process. Line.

[58]  Hongping Cai,et al.  Learning Linear Discriminant Projections for Dimensionality Reduction of Image Descriptors , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[59]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[60]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[61]  Darius Burschka,et al.  Adaptive and Generic Corner Detection Based on the Accelerated Segment Test , 2010, ECCV.

[62]  Pascal Fua,et al.  Training for Task Specific Keypoint Detection , 2009, DAGM-Symposium.

[63]  Matthew A. Brown,et al.  Picking the best DAISY , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[64]  Sam S. Tsai,et al.  CHoG: Compressed histogram of gradients A low bit-rate feature descriptor , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[65]  Jiri Matas,et al.  Learning Fast Emulators of Binary Decision Processes , 2009, International Journal of Computer Vision.

[66]  Tom Drummond,et al.  Faster and Better: A Machine Learning Approach to Corner Detection , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[67]  Michael Isard,et al.  Lost in quantization: Improving particular object retrieval in large scale image databases , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[68]  Jiri Matas,et al.  Improving Descriptors for Fast Tree Matching by Optimal Linear Projection , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[69]  Gang Hua,et al.  Discriminant Embedding for Local Image Descriptors , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[70]  Matthew A. Brown,et al.  Learning Local Image Descriptors , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[71]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[72]  Robert Pless,et al.  Consistent Temporal Variations in Many Outdoor Scenes , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[73]  Leonardo Trujillo,et al.  Using Evolution to Learn How to Perform Interest Point Detection , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[74]  Mark Goadrich,et al.  The relationship between Precision-Recall and ROC curves , 2006, ICML.

[75]  Tom Drummond,et al.  Machine Learning for High-Speed Corner Detection , 2006, ECCV.

[76]  Cordelia Schmid,et al.  A Comparison of Affine Region Detectors , 2005, International Journal of Computer Vision.

[77]  C. Schmid,et al.  A performance evaluation of local descriptors , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[78]  B. Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[79]  Jiri Matas,et al.  Robust wide-baseline stereo from maximally stable extremal regions , 2004, Image Vis. Comput..

[80]  R. Sukthankar,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[81]  Cordelia Schmid,et al.  An Affine Invariant Interest Point Detector , 2002, ECCV.

[82]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[83]  Luc Van Gool,et al.  Affine/ Photometric Invariants for Planar Intensity Patterns , 1996, ECCV.

[84]  Matti Pietikäinen,et al.  Performance evaluation of texture measures with classification based on Kullback discrimination of distributions , 1994, Proceedings of 12th International Conference on Pattern Recognition.

[85]  Edward H. Adelson,et al.  The Design and Use of Steerable Filters , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[86]  J. Koenderink,et al.  Representation of local geometry in the visual system , 1987, Biological Cybernetics.

[87]  Krystian Mikolajczyk,et al.  Learning local feature descriptors with triplets and shallow convolutional neural networks , 2016, BMVC.

[88]  Serge J. Belongie,et al.  Learning to Detect and Match Keypoints with Deep Architectures , 2016, BMVC.

[89]  Pascal Fua,et al.  LDAHash: Improved Matching with Smaller Descriptors , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[90]  Matthew A. Brown,et al.  Discriminative Learning of Local Image Descriptors , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[91]  J. P. Lewis,et al.  Fast Template Matching , 2009 .

[92]  Christopher Hunt SURF: Speeded-Up Robust Features , 2009 .

[93]  Horst Bischof,et al.  Learned local descriptors for recognition and matching , 2008 .

[94]  Tom Fawcett,et al.  ROC Graphs: Notes and Practical Considerations for Researchers , 2007 .

[95]  Heekuck Oh,et al.  Neural Networks for Pattern Recognition , 1993, Adv. Comput..

[96]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[97]  To appear in: CAV 2007. Shape Analysis for Composite Data Structures , 2022 .