DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes

The assumption of scene rigidity is typical in SLAM algorithms. Such a strong assumption limits the use of most visual SLAM systems in populated real-world environments, which are the target of several relevant applications such as service robotics and autonomous vehicles. In this letter we present DynaSLAM, a visual SLAM system that builds on ORB-SLAM2 and adds dynamic object detection and background inpainting. DynaSLAM is robust in dynamic scenarios for monocular, stereo, and RGB-D configurations. It detects moving objects by multiview geometry, deep learning, or both. Having a static map of the scene allows inpainting the frame background that has been occluded by such dynamic objects. We evaluate our system on public monocular, stereo, and RGB-D datasets, and we study the impact of several accuracy/speed trade-offs to assess the limits of the proposed methodology. DynaSLAM is more accurate than standard visual SLAM baselines in highly dynamic scenarios. It also estimates a map of the static parts of the scene, which is essential for long-term applications in real-world environments.
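
The idea of detecting moving objects "by multiview geometry, deep learning, or both" can be illustrated with a minimal sketch. It assumes each detector yields a per-pixel boolean mask and that "both" means taking their union; the abstract does not specify the combination rule, and the function names below are hypothetical, not part of DynaSLAM or ORB-SLAM2.

```python
import numpy as np

def combine_dynamic_masks(semantic_mask, geometric_mask):
    """Union of two boolean masks (True = pixel assumed to be dynamic).

    Assumption: the two cues are merged with a logical OR.
    """
    return np.logical_or(semantic_mask, geometric_mask)

def keep_static_keypoints(keypoints, dynamic_mask):
    """Keep (x, y) pixel keypoints that fall outside the dynamic mask."""
    pts = np.asarray(keypoints, dtype=int)
    # Masks are indexed [row, col], i.e. [y, x].
    is_dynamic = dynamic_mask[pts[:, 1], pts[:, 0]]
    return pts[~is_dynamic]

# Toy 4x6 image: one region flagged by each (hypothetical) detector.
semantic = np.zeros((4, 6), dtype=bool)
semantic[1:3, 1:3] = True      # e.g. a segmented person
geometric = np.zeros((4, 6), dtype=bool)
geometric[0:2, 4:6] = True     # e.g. pixels inconsistent across views

dynamic = combine_dynamic_masks(semantic, geometric)
keypoints = [(0, 0), (1, 1), (5, 0), (3, 3)]
static_pts = keep_static_keypoints(keypoints, dynamic)
print(static_pts)
```

In this toy setup only the keypoints outside both masked regions would be passed on to tracking and mapping, which is how a static-only map could be maintained under these assumptions.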
