Monocular Visual-IMU Odometry: A Comparative Evaluation of Detector–Descriptor-Based Methods

Monocular visual-inertial measurement unit (IMU) odometry is widely used in intelligent vehicles. Detector–descriptor-based visual-IMU odometry is a popular approach, effective and efficient because local descriptors are robust against occlusions, background clutter, and abrupt content changes. However, to our knowledge, no comprehensive comparative evaluation exists of the different combinations of recently developed detectors and descriptors. To bridge this gap, we conduct such a study in a unified framework. In particular, we select six typical routes with different lengths, shapes, and road scenes from the well-known KITTI dataset. We first evaluate the performance of different combinations of salient point detectors and local descriptors on the six routes. We then tune the parameters of the best detector or descriptor obtained for each route to further improve the results. This paper provides not only comprehensive benchmarks for assessing various algorithms but also instructive guidelines and insights for developing detectors and descriptors to handle different road scenes.
