A Review of Visual-Inertial Simultaneous Localization and Mapping from Filtering-Based and Optimization-Based Perspectives

Visual-inertial simultaneous localization and mapping (VI-SLAM) is a popular research topic in robotics. Because of its robustness, VI-SLAM is widely applied to localization and mapping in mobile robotics, self-driving cars, unmanned aerial vehicles, and autonomous underwater vehicles. This study provides a comprehensive survey of VI-SLAM. Following a short introduction, it is the first to review VI-SLAM techniques from filtering-based and optimization-based perspectives. It summarizes state-of-the-art studies from the last 10 years, classified by back-end approach, camera type, and sensor fusion type. Key VI-SLAM technologies, such as feature extraction and tracking, core theory, and loop closure, are also introduced. The performance of representative VI-SLAM methods and well-known VI-SLAM datasets are also surveyed. Finally, this study compares filtering-based and optimization-based methods through experiments, which helps clarify the differences in their operating principles. Optimization-based methods achieve excellent localization accuracy and lower memory utilization, while filtering-based methods have advantages in terms of computing resources. Furthermore, this study proposes future development trends and research directions for VI-SLAM. It provides a detailed survey of VI-SLAM techniques and can serve as a brief guide both for newcomers to the field of SLAM and for experienced researchers looking for possible directions for future work.
