Semantic SLAM With More Accurate Point Cloud Map in Dynamic Environments

Most existing vision-based SLAM (simultaneous localization and mapping) systems assume a static environment, which greatly limits their use in real-world scenes. In a dynamic environment, the quality of the global point cloud map constructed by a SLAM system depends on both the accuracy of camera pose estimation and the removal of noise blocks from the local point cloud maps. Most dynamic SLAM systems focus on improving camera localization accuracy and rarely address noise block removal. In this paper, we propose a novel semantic SLAM system that builds a more accurate point cloud map in dynamic environments. We obtain the masks and bounding boxes of dynamic objects in the images with BlitzNet. The mask of each dynamic object is extended by analyzing the depth statistics of the mask within its bounding box. Islands generated by residual information of dynamic objects are removed by a morphological operation after geometric segmentation. Using the bounding boxes, each image can be quickly divided into environment regions and dynamic regions; the depth-stable matching points in the environment regions are then used to construct epipolar constraints that locate the static matching points in the dynamic regions. To verify the performance of the proposed SLAM system, we conduct experiments on the TUM RGB-D datasets. Compared with state-of-the-art dynamic SLAM systems, our system constructs the most accurate global point cloud map.
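The epipolar check mentioned above can be illustrated with a minimal sketch. It assumes a fundamental matrix `F` has already been estimated from the depth-stable matches in the environment regions; the function names, the pixel threshold, and the use of point-to-epipolar-line distance as the test are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def epipolar_distances(F, pts1, pts2):
    """Distance (in pixels) from each point in pts2 to the epipolar line F @ x1.

    F    : 3x3 fundamental matrix (assumed already estimated elsewhere).
    pts1 : (N, 2) pixel coordinates in the first image.
    pts2 : (N, 2) matched pixel coordinates in the second image.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    ones = np.ones((pts1.shape[0], 1))
    x1 = np.hstack([pts1, ones])          # homogeneous points, image 1
    x2 = np.hstack([pts2, ones])          # homogeneous points, image 2
    lines = x1 @ F.T                      # row i is the epipolar line F @ x1[i]
    numerator = np.abs(np.sum(lines * x2, axis=1))
    denominator = np.hypot(lines[:, 0], lines[:, 1])
    return numerator / np.maximum(denominator, 1e-12)


def static_match_mask(F, pts1, pts2, threshold_px=1.0):
    """True where a match is consistent with the epipolar geometry.

    threshold_px is a hypothetical tuning parameter, not a value from the paper.
    """
    return epipolar_distances(F, pts1, pts2) < threshold_px
```

In this sketch, matches inside a dynamic region whose distance stays below the threshold would be kept as static, while the rest would be treated as belonging to a moving object.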
