VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator

One camera and one low-cost inertial measurement unit (IMU) form a monocular visual-inertial system (VINS), which is the minimum sensor suite (in size, weight, and power) for metric six-degrees-of-freedom (DOF) state estimation. In this paper, we present VINS-Mono: a robust and versatile monocular visual-inertial state estimator. Our approach starts with a robust procedure for estimator initialization. A tightly coupled, nonlinear optimization-based method is used to obtain highly accurate visual-inertial odometry by fusing preintegrated IMU measurements and feature observations. A loop detection module, in combination with our tightly coupled formulation, enables relocalization with minimal computation. We additionally perform 4-DOF pose graph optimization to enforce global consistency. Furthermore, the proposed system can reuse a map by saving and loading it efficiently. The current and previous maps can be merged by the global pose graph optimization. We validate the performance of our system on public datasets and in real-world experiments, and compare it against other state-of-the-art algorithms. We also perform onboard closed-loop autonomous flight on a micro aerial vehicle (MAV) platform and port the algorithm to an iOS-based demonstration. We highlight that the proposed work is a reliable, complete, and versatile system that is applicable to a range of applications that require high-accuracy localization. We open source our implementations for both PCs (https://github.com/HKUST-Aerial-Robotics/VINS-Mono) and iOS mobile devices (https://github.com/HKUST-Aerial-Robotics/VINS-Mobile).
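
To make the 4-DOF pose graph idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each keyframe pose is reduced to (x, y, z, yaw), on the premise that roll and pitch are already observable from gravity in a visual-inertial system, so only position and yaw drift. The function names `wrap_angle`, `relative_4dof`, and `edge_residual` are hypothetical and chosen only for illustration.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def relative_4dof(pose_i, pose_j):
    """Return pose_j relative to pose_i as (x, y, z, yaw).

    Each pose is a tuple (x, y, z, yaw) in a gravity-aligned world frame
    (an assumption of this sketch). The displacement is expressed in the
    yaw-aligned frame of pose_i.
    """
    xi, yi, zi, yaw_i = pose_i
    xj, yj, zj, yaw_j = pose_j
    dx, dy, dz = xj - xi, yj - yi, zj - zi
    c, s = math.cos(yaw_i), math.sin(yaw_i)
    # Apply R_z(yaw_i)^T to the world-frame displacement.
    return (c * dx + s * dy, -s * dx + c * dy, dz, wrap_angle(yaw_j - yaw_i))


def edge_residual(pose_i, pose_j, measured):
    """Residual of one pose graph edge: predicted minus measured relative pose.

    `measured` is the (x, y, z, yaw) relative pose attached to the edge,
    for example from odometry or from a loop closure.
    """
    pred = relative_4dof(pose_i, pose_j)
    return (
        pred[0] - measured[0],
        pred[1] - measured[1],
        pred[2] - measured[2],
        wrap_angle(pred[3] - measured[3]),
    )


if __name__ == "__main__":
    pose_a = (1.0, 2.0, 0.5, math.pi / 2)
    pose_b = (1.0, 3.0, 0.5, math.pi)
    measurement = relative_4dof(pose_a, pose_b)
    print(edge_residual(pose_a, pose_b, measurement))
```

A pose graph optimizer would sum such residuals over all sequential and loop-closure edges and minimize them with a nonlinear least-squares solver; that step is omitted here.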
