Real-time Visual-inertial Localization using Summary Maps

Localization in a global reference frame is a fundamental milestone towards high-level robotics applications such as autonomous navigation and obstacle avoidance. Visual-inertial SLAM has become a compelling approach for this task, despite its inherent drift and the fact that it only provides local pose estimates. These shortcomings can be addressed by matching against maps built in previous sessions. Commonly, careful data selection is performed to keep the map size tractable and thus enable real-time localization. Although such map summarization usually guarantees global localization coverage, accuracy suffers because fewer matches are available. In this work, we aim to mitigate this effect by directly integrating the scarce 2D-3D matches with visual feature tracks and inertial measurements in a sliding-window-based optimization framework. We compare our approach against motion-tracking data and demonstrate that such joint estimation yields smoother and more accurate global pose estimates than related methods that loosely integrate 6-DoF localization poses with VIO. Finally, we evaluate the impact of varying map summarization parameters on the trade-off between map size and localization accuracy, and demonstrate that our approach allows for more aggressive summarization while retaining the robustness and accuracy achieved with larger maps.
