ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual–Inertial, and Multimap SLAM

This article presents ORB-SLAM3, the first system able to perform visual, visual-inertial and multimap SLAM with monocular, stereo and RGB-D cameras, using pin-hole and fisheye lens models. The first main novelty is a tightly integrated visual-inertial SLAM system that fully relies on maximum a posteriori (MAP) estimation, even during IMU initialization, resulting in real-time robust operation in small and large, indoor and outdoor environments, being two to ten times more accurate than previous approaches. The second main novelty is a multiple map system relying on a new place recognition method with improved recall that lets ORB-SLAM3 survive to long periods of poor visual information: when it gets lost, it starts a new map that will be seamlessly merged with previous maps when revisiting them. Compared with visual odometry systems that only use information from the last few seconds, ORB-SLAM3 is the first system able to reuse in all the algorithm stages all previous information from high parallax co-visible keyframes, even if they are widely separated in time or come from previous mapping sessions, boosting accuracy. Our experiments show that, in all sensor configurations, ORB-SLAM3 is as robust as the best systems available in the literature and significantly more accurate. Notably, our stereo-inertial SLAM achieves an average accuracy of 3.5 cm in the EuRoC drone and 9 mm under quick hand-held motions in the room of TUM-VI dataset, representative of AR/VR scenarios. For the benefit of the community we make public the source code.

[1]  Laurent Kneip,et al.  Collaborative monocular SLAM with multiple Micro Aerial Vehicles , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[2]  Tom Drummond,et al.  Unified Loop Closing and Recovery for Real Time Monocular SLAM , 2008, BMVC.

[3]  Gabe Sibley,et al.  MOARSLAM: Multiple Operator Augmented RSLAM , 2014, DARS.

[4]  Jörg Stückler,et al.  The TUM VI Benchmark for Evaluating Visual-Inertial Odometry , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[5]  Roland Siegwart,et al.  Robust visual inertial odometry using a direct EKF-based approach , 2015, IROS 2015.

[6]  Flavio Fontana,et al.  Simultaneous State Initialization and Gyroscope Bias Calibration in Visual Inertial Aided Navigation , 2017, IEEE Robotics and Automation Letters.

[7]  Stergios I. Roumeliotis,et al.  A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation , 2007, Proceedings 2007 IEEE International Conference on Robotics and Automation.

[8]  Ian D. Reid,et al.  Mapping Large Loops with a Single Hand-Held Camera , 2007, Robotics: Science and Systems.

[9]  J. M. M. Montiel,et al.  ORB-SLAM: A Versatile and Accurate Monocular SLAM System , 2015, IEEE Transactions on Robotics.

[10]  Daniel Cremers,et al.  LSD-SLAM: Large-Scale Direct Monocular SLAM , 2014, ECCV.

[11]  Jörg Stückler,et al.  Visual-Inertial Mapping With Non-Linear Factor Recovery , 2019, IEEE Robotics and Automation Letters.

[12]  Salah Sukkarieh,et al.  Visual-Inertial-Aided Navigation for High-Dynamic Motion in Built Environments Without Initial Conditions , 2012, IEEE Transactions on Robotics.

[13]  David W. Murray,et al.  Video-rate localization in multiple maps for wearable augmented reality , 2008, 2008 12th IEEE International Symposium on Wearable Computers.

[14]  Javier Civera,et al.  1-Point RANSAC for extended Kalman filtering: Application to real-time structure from motion and visual odometry , 2010 .

[15]  Luca Carlone,et al.  Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[16]  Andrew J. Davison,et al.  Real-time simultaneous localisation and mapping with a single camera , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[17]  Javier Civera,et al.  Loosely-Coupled Semi-Direct Monocular SLAM , 2019, IEEE Robotics and Automation Letters.

[18]  Dorian Gálvez-López,et al.  Bags of Binary Words for Fast Place Recognition in Image Sequences , 2012, IEEE Transactions on Robotics.

[19]  Michael Bosse,et al.  Keyframe-based visual–inertial odometry using nonlinear optimization , 2015, Int. J. Robotics Res..

[20]  Davide Scaramuzza,et al.  SVO: Fast semi-direct monocular visual odometry , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[21]  Daniel Cremers,et al.  Stereo DSO: Large-Scale Direct Sparse Visual Odometry with Stereo Cameras , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[22]  Wolfram Burgard,et al.  A benchmark for the evaluation of RGB-D SLAM systems , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[23]  Anastasios I. Mourikis,et al.  High-precision, consistent EKF-based visual-inertial odometry , 2013, Int. J. Robotics Res..

[24]  Juan D. Tardós,et al.  Fast relocalisation and loop closing in keyframe-based SLAM , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[25]  John J. Leonard,et al.  Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age , 2016, IEEE Transactions on Robotics.

[26]  V. Lepetit,et al.  EPnP: An Accurate O(n) Solution to the PnP Problem , 2009, International Journal of Computer Vision.

[27]  Davide Scaramuzza,et al.  A Benchmark Comparison of Monocular Visual-Inertial Odometry Algorithms for Flying Robots , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[28]  Patrik Schmuck,et al.  Multi-UAV collaborative monocular SLAM , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[29]  Carlos Campos,et al.  Inertial-Only Optimization for Visual-Inertial Initialization , 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[30]  Javier Civera,et al.  C2TAM: A Cloud framework for cooperative tracking and mapping , 2014, Robotics Auton. Syst..

[31]  Kurt Konolige,et al.  Double window optimisation for constant time visual SLAM , 2011, 2011 International Conference on Computer Vision.

[32]  Daniel Cremers,et al.  Direct Sparse Visual-Inertial Odometry Using Dynamic Marginalization , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[33]  David W. Murray,et al.  Improving the Agility of Keyframe-Based SLAM , 2008, ECCV.

[34]  Juan D. Tardós,et al.  Visual-Inertial Monocular SLAM With Map Reuse , 2016, IEEE Robotics and Automation Letters.

[35]  David W. Murray,et al.  Parallel Tracking and Mapping on a camera phone , 2009, 2009 8th IEEE International Symposium on Mixed and Augmented Reality.

[36]  J. M. M. Montiel,et al.  ORBSLAM-Atlas: a robust and accurate multi-map system , 2019, 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[37]  Stergios I. Roumeliotis,et al.  Alternating-Stereo VINS: Observability Analysis and Performance Evaluation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Daniel Cremers,et al.  Direct Sparse Odometry , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Denis Grießbach,et al.  Vision aided inertial navigation , 2010 .

[40]  Juan D. Tardós,et al.  ORB-SLAM2: An Open-Source SLAM System for Monocular, Stereo, and RGB-D Cameras , 2016, IEEE Transactions on Robotics.

[41]  Jörg Stückler,et al.  Large-scale direct SLAM with stereo cameras , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[42]  J. M. M. Montiel,et al.  Fast and Robust Initialization for Visual-Inertial SLAM , 2019, 2019 International Conference on Robotics and Automation (ICRA).

[43]  Roland Siegwart,et al.  Keyframe-Based Visual-Inertial SLAM using Nonlinear Optimization , 2013, Robotics: Science and Systems.

[44]  Javier Civera,et al.  Inverse Depth Parametrization for Monocular SLAM , 2008, IEEE Transactions on Robotics.

[45]  Roland Siegwart,et al.  The EuRoC micro aerial vehicle datasets , 2016, Int. J. Robotics Res..

[46]  Agostino Martinelli,et al.  Closed-Form Solution of Visual-Inertial Structure from Motion , 2013, International Journal of Computer Vision.

[47]  Michael Gassner,et al.  SVO: Semidirect Visual Odometry for Monocular and Multicamera Systems , 2017, IEEE Transactions on Robotics.

[48]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[49]  Hauke Strasdat,et al.  Visual SLAM: Why filter? , 2012, Image Vis. Comput..

[50]  Javier Civera,et al.  1‐Point RANSAC for extended Kalman filtering: Application to real‐time structure from motion and visual odometry , 2010, J. Field Robotics.

[51]  Roger Y. Tsai,et al.  A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses , 1987, IEEE J. Robotics Autom..

[52]  Roland Siegwart,et al.  Iterated extended Kalman filter based visual-inertial odometry using direct photometric feedback , 2017, Int. J. Robotics Res..

[53]  Olivier Stasse,et al.  MonoSLAM: Real-Time Single Camera SLAM , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[54]  G. Klein,et al.  Parallel Tracking and Mapping for Small AR Workspaces , 2007, 2007 6th IEEE and ACM International Symposium on Mixed and Augmented Reality.

[55]  Chu Kiong Loo,et al.  SLAMM: Visual monocular SLAM with continuous mapping using multiple maps , 2018, PloS one.

[56]  Nan Yang,et al.  D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[57]  Juho Kannala,et al.  A generic camera model and calibration method for conventional, wide-angle, and fish-eye lenses , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[58]  Shaojie Shen,et al.  VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator , 2017, IEEE Transactions on Robotics.

[59]  Daniel Cremers,et al.  LDSO: Direct Sparse Odometry with Loop Closure , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[60]  Berthold K. P. Horn,et al.  Closed-form solution of absolute orientation using unit quaternions , 1987 .

[61]  Jörg Stückler,et al.  Omnidirectional DSO: Direct Sparse Odometry With Fisheye Cameras , 2018, IEEE Robotics and Automation Letters.

[62]  Margarita Chli,et al.  CCM‐SLAM: Robust and efficient centralized collaborative monocular simultaneous localization and mapping for robotic teams , 2018, J. Field Robotics.

[63]  Shaojie Shen,et al.  A General Optimization-based Framework for Local Odometry Estimation with Multiple Sensors , 2019, ArXiv.

[64]  Hauke Strasdat,et al.  Scale Drift-Aware Large Scale Monocular SLAM , 2010, Robotics: Science and Systems.

[65]  Frank Dellaert,et al.  On-Manifold Preintegration for Real-Time Visual--Inertial Odometry , 2015, IEEE Transactions on Robotics.

[66]  Stefan Hinz,et al.  MLPnP - A Real-Time Maximum Likelihood Solution to the Perspective-n-Point Problem , 2016, ArXiv.

[67]  Joel A. Hesch,et al.  A comparative analysis of tightly-coupled monocular, binocular, and stereo VINS , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[68]  Iker Aguinaga,et al.  Direct Sparse Mapping , 2019, IEEE Transactions on Robotics.