SplitFusion: Simultaneous Tracking and Mapping for Non-Rigid Scenes

We present SplitFusion, a novel dense RGB-D SLAM framework that simultaneously performs tracking and dense reconstruction for both the rigid and non-rigid components of a scene. SplitFusion first applies a deep-learning-based semantic instance segmentation technique to split the scene into rigid and non-rigid surfaces. The split surfaces are then tracked independently, via rigid ICP for the rigid surfaces and non-rigid ICP for the non-rigid ones, and reconstructed through incremental depth map fusion. Experimental results show that the proposed approach provides not only accurate environment maps but also well-reconstructed non-rigid targets, e.g., moving humans.
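The split-then-fuse idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_depth` and `fuse`, the per-pixel boolean mask, the image-space weighted running average, and the `max_weight` cap are all assumptions made for illustration. The tracking step (rigid or non-rigid ICP) is omitted, so the sketch assumes incoming frames are already aligned to the model.

```python
from typing import List, Tuple

Depth = List[List[float]]  # row-major depth map in metres; 0.0 means "no measurement"
Mask = List[List[bool]]    # True where the segmentation labels a pixel as non-rigid


def split_depth(depth: Depth, nonrigid: Mask) -> Tuple[Depth, Depth]:
    """Split one depth map into a rigid part and a non-rigid part (hypothetical helper)."""
    rigid = [[0.0 if m else d for d, m in zip(d_row, m_row)]
             for d_row, m_row in zip(depth, nonrigid)]
    non_rigid = [[d if m else 0.0 for d, m in zip(d_row, m_row)]
                 for d_row, m_row in zip(depth, nonrigid)]
    return rigid, non_rigid


def fuse(model: Depth, weights: Depth, frame: Depth, max_weight: float = 100.0) -> None:
    """Update the model in place with a per-pixel weighted running average.

    Assumes `frame` is already aligned to `model`; pixels without a
    measurement (depth <= 0) leave the model untouched.
    """
    for r, row in enumerate(frame):
        for c, d in enumerate(row):
            if d <= 0.0:
                continue
            w = weights[r][c]
            model[r][c] = (model[r][c] * w + d) / (w + 1.0)
            weights[r][c] = min(w + 1.0, max_weight)  # cap so old data can still be revised


if __name__ == "__main__":
    depth = [[1.0, 2.0], [0.0, 4.0]]
    mask = [[False, True], [False, False]]
    rigid, non_rigid = split_depth(depth, mask)

    model = [[0.0, 0.0], [0.0, 0.0]]
    weights = [[0.0, 0.0], [0.0, 0.0]]
    fuse(model, weights, rigid)
    print(model, weights)
```

In this sketch each surface class would keep its own `model` and `weights`, so a moving person never contaminates the static map. A full system would fuse into a volumetric or surfel representation rather than a single image-space depth map.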
