KMOP-vSLAM: Dynamic Visual SLAM for RGB-D Cameras using K-means and OpenPose

Although tremendous progress has been made in Simultaneous Localization and Mapping (SLAM), the assumption of scene rigidity limits the use of visual SLAM in real-world applications such as computer vision, mobile robotics, and augmented reality. To make SLAM robust in dynamic environments, outlier features on moving objects, including unknown objects, must be removed from the tracking process. To address this challenge, we present KMOP-vSLAM, a novel real-time visual SLAM system that adds unsupervised-learning segmentation and human detection to reduce tracking drift in dynamic indoor environments. We propose an efficient geometric outlier-detection method that uses dynamic information from previous frames, together with a novel probability model, to identify moving objects with the help of geometric constraints and human detection. Outlier features belonging to moving objects are thereby detected and removed from tracking. The well-known TUM dataset is used to evaluate tracking error in dynamic scenes where people walk around; our approach yields significantly lower trajectory error than state-of-the-art visual SLAM systems using an RGB-D camera.
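The geometric constraint underlying this kind of outlier rejection is typically the epipolar constraint: a feature match consistent with the static scene must lie near the epipolar line induced by the camera motion, while features on moving objects violate it. The sketch below illustrates that idea only; the function names, the pixel threshold, and the use of a given fundamental matrix `F` are assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def epipolar_distances(F, pts1, pts2):
    """Point-to-epipolar-line distance (in pixels) for matched points.

    F     : 3x3 fundamental matrix between the two frames.
    pts1  : Nx2 pixel coordinates in the previous frame.
    pts2  : Nx2 pixel coordinates in the current frame.
    """
    pts1_h = np.hstack([pts1, np.ones((len(pts1), 1))])  # homogeneous coords
    pts2_h = np.hstack([pts2, np.ones((len(pts2), 1))])
    lines2 = pts1_h @ F.T                  # epipolar lines l = F @ p1 in image 2
    num = np.abs(np.sum(lines2 * pts2_h, axis=1))   # |l . p2|
    den = np.linalg.norm(lines2[:, :2], axis=1)     # normalize line coefficients
    return num / den

def flag_dynamic(F, pts1, pts2, thresh=2.0):
    """Flag matches whose epipolar error exceeds `thresh` pixels as dynamic outliers."""
    return epipolar_distances(F, pts1, pts2) > thresh

# Toy example: pure horizontal camera translation (identity intrinsics),
# so F is the skew-symmetric matrix of t = [1, 0, 0] and static points
# keep the same image row. The second match moves 5 px vertically,
# violating the constraint, so it is flagged as dynamic.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
pts1 = np.array([[10.0, 20.0], [30.0, 40.0]])
pts2 = np.array([[15.0, 20.0], [30.0, 45.0]])
print(flag_dynamic(F, pts1, pts2))  # [False  True]
```

In a full system, features flagged this way (or falling inside detected human / segmented dynamic regions) would be excluded before pose optimization.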
