A comparative analysis of SIFT, SURF, KAZE, AKAZE, ORB, and BRISK

Image registration is the process of matching, aligning, and overlaying two or more images of a scene captured from different viewpoints. It is used extensively in vision-based applications. Image registration has five main stages: feature detection and description; feature matching; outlier rejection; derivation of the transformation function; and image reconstruction. The timing and accuracy of feature-based image registration depend mainly on the computational efficiency and robustness, respectively, of the selected feature detector-descriptor. The choice of feature detector-descriptor is therefore a critical decision in feature-matching applications. This article presents a comprehensive comparison of the SIFT, SURF, KAZE, AKAZE, ORB, and BRISK algorithms. It also addresses a critical question: which algorithm is most invariant to scale, rotation, and viewpoint changes? To investigate this, image matching has been performed with these features to match scaled versions (5% to 500%), rotated versions (0° to 360°), and perspective-transformed versions of standard images with the originals. Experiments have been conducted on diverse images taken from benchmark datasets: University of Oxford, MATLAB, VLFeat, and OpenCV. Nearest-Neighbor-Distance-Ratio has been used as the feature-matching strategy, while RANSAC has been applied for rejecting outliers and fitting the transformation models. Results are presented in terms of quantitative comparison, feature-detection-description time, feature-matching time, time of outlier rejection and model fitting, repeatability, and error in the recovered results compared with the ground truths. SIFT and BRISK are found to be the most accurate algorithms, while ORB and BRISK are the most efficient. The article is intended to inform design decisions in vision-based applications, and its main aim is to set a benchmark for researchers, regardless of any particular area.
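The Nearest-Neighbor-Distance-Ratio strategy named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `nndr_match`, the ratio threshold of 0.7, the Euclidean distance, and the toy two-dimensional descriptors are all assumptions made for illustration. A match is kept only when the nearest descriptor is clearly closer than the second nearest.

```python
import math

def nndr_match(desc_a, desc_b, ratio=0.7):
    """Return (i, j) index pairs that pass the nearest-neighbour distance-ratio test.

    desc_a, desc_b: sequences of equal-length numeric descriptors.
    ratio: hypothetical threshold (0.7 is an assumption, not a value from the article).
    """
    matches = []
    if len(desc_b) < 2:
        # The ratio test needs a second-nearest neighbour to compare against.
        return matches
    for i, a in enumerate(desc_a):
        # Euclidean distance from descriptor a to every descriptor in desc_b,
        # sorted so the two nearest candidates come first.
        dists = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        (d1, j1), (d2, _) = dists[0], dists[1]
        # Keep the match only if the best candidate is clearly better
        # than the runner-up; ambiguous matches are discarded.
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches

# Toy descriptors (assumed data): the third query point is equidistant
# from two candidates, so the ratio test rejects it as ambiguous.
desc_a = [(0.0, 0.0), (5.0, 5.0), (2.0, 2.0)]
desc_b = [(0.1, 0.0), (5.0, 5.2), (9.0, 9.0), (1.0, 3.0), (3.0, 1.0)]
matches = nndr_match(desc_a, desc_b)
print(matches)
```

Euclidean distance suits floating-point descriptors such as those of SIFT, SURF, and KAZE. For binary descriptors such as those of ORB and BRISK, a Hamming distance would normally take its place. The surviving matches would then be passed to a robust estimator such as RANSAC for outlier rejection and model fitting.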
