A comparison of feature descriptors for visual SLAM

Feature detection and feature description plays an important part in Visual Simultaneous Localization and Mapping (VSLAM). Visual features are commonly used to efficiently estimate the motion of the camera (visual odometry) and link the current image to previously visited parts of the environment (place recognition, loop closure). Gradient histogram-based feature descriptors, like SIFT and SURF, are frequently used for this task. Recently introduced binary descriptors, as BRIEF or BRISK, claim to offer similar capabilities at lower computational cost. In this paper, we will compare the most popular feature descriptors in a typical graph-based VSLAM algorithm using two publicly available datasets to determine the impact of the choice for feature descriptor in terms of accuracy and speed in a realistic scenario.

[1]  Hong Zhang,et al.  Performance evaluation of visual SLAM using several feature extractors , 2009, 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[2]  Tom Drummond,et al.  Machine Learning for High-Speed Corner Detection , 2006, ECCV.

[3]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[4]  Kurt Konolige,et al.  Double window optimisation for constant time visual SLAM , 2011, 2011 International Conference on Computer Vision.

[5]  S. Govindarajulu,et al.  A Comparison of SIFT, PCA-SIFT and SURF , 2012 .

[6]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[7]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Erik Maehle,et al.  Graph-based visual SLAM and visual odometry using an RGB-D camera , 2013, 9th International Workshop on Robot Motion and Control.

[9]  Luo Juan,et al.  A comparison of SIFT, PCA-SIFT and SURF , 2009 .

[10]  Gary R. Bradski,et al.  ORB: An efficient alternative to SIFT or SURF , 2011, 2011 International Conference on Computer Vision.

[11]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[12]  V. Lepetit,et al.  EPnP: An Accurate O(n) Solution to the PnP Problem , 2009, International Journal of Computer Vision.

[13]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[14]  Wolfram Burgard,et al.  Towards a benchmark for RGB-D SLAM evaluation , 2011, RSS 2011.

[15]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[16]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[17]  Pierre Vandergheynst,et al.  FREAK: Fast Retina Keypoint , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[19]  Wolfram Burgard,et al.  An evaluation of the RGB-D SLAM system , 2012, 2012 IEEE International Conference on Robotics and Automation.

[20]  Roland Siegwart,et al.  BRISK: Binary Robust invariant scalable keypoints , 2011, 2011 International Conference on Computer Vision.

[21]  Wolfram Burgard,et al.  G2o: A general framework for graph optimization , 2011, 2011 IEEE International Conference on Robotics and Automation.

[22]  Piotr Indyk,et al.  Similarity Search in High Dimensions via Hashing , 1999, VLDB.