A Benchmark Comparison of Monocular Visual-Inertial Odometry Algorithms for Flying Robots

Flying robots require both accuracy and low latency in their state estimation in order to achieve stable and robust flight. However, due to the power and payload constraints of aerial platforms, state estimation algorithms must provide these qualities under the computational constraints of embedded hardware. Cameras and inertial measurement units (IMUs) satisfy these power and payload constraints, and they allow operation without external localization from motion capture or global positioning systems, so visual-inertial odometry (VIO) algorithms are popular choices for state estimation in these scenarios. It is not clear from existing results in the literature, however, which VIO algorithms perform well under the accuracy, latency, and computational constraints of a flying robot with onboard state estimation. This paper evaluates an array of publicly available VIO pipelines (MSCKF, OKVIS, ROVIO, VINS-Mono, SVO+MSF, and SVO+GTSAM) on different hardware configurations, including several single-board computer systems that are typically found on flying robots. The evaluation considers the pose estimation accuracy, per-frame processing time, and CPU and memory load while processing the EuRoC datasets, which contain six-degree-of-freedom (6DoF) trajectories typical of flying robots. We present our complete results as a benchmark for the research community.
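As an illustration of how pose estimation accuracy can be scored against ground truth, the following is a minimal sketch, not the evaluation code used in this work. It assumes that estimated and ground-truth positions are already time-associated into two N x 3 arrays, aligns them with a least-squares transform in the style of Umeyama, and reports the root-mean-square position error. The function names `align_umeyama` and `ate_rmse` and the `with_scale` switch are hypothetical choices made for this example.

```python
import numpy as np

def align_umeyama(est, gt, with_scale=False):
    """Least-squares transform (s, R, t) such that gt ~= s * R @ est + t.

    est, gt: (N, 3) arrays of time-associated positions (an assumption of
    this sketch; timestamp association is not handled here).
    """
    n = est.shape[0]
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    cov = g.T @ e / n                      # 3x3 cross-covariance
    U, d, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                     # avoid returning a reflection
    R = U @ S @ Vt
    var_e = (e ** 2).sum() / n             # variance of the estimated points
    s = float(np.trace(np.diag(d) @ S) / var_e) if with_scale else 1.0
    t = mu_g - s * R @ mu_e
    return s, R, t

def ate_rmse(est, gt, with_scale=False):
    """RMS position error after aligning est to gt."""
    s, R, t = align_umeyama(est, gt, with_scale)
    aligned = s * (R @ est.T).T + t
    err = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```

Rigid alignment (`with_scale=False`) is the default here because a VIO estimate is expected to carry metric scale; whether to also estimate scale is an evaluation design choice that this sketch leaves to the caller.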
