HeteroFusion: Dense Scene Reconstruction Integrating Multi-Sensors

We present a novel approach that integrates data from multiple sensor types for dense 3D reconstruction of indoor scenes in real time. Existing algorithms are mainly based on a single RGB-D camera and thus require continuous scanning of areas with sufficient geometric features; otherwise, tracking may fail due to unreliable frame registration. Inspired by the fact that fusing multiple sensors can combine their strengths toward more robust and accurate self-localization, we incorporate several types of sensors that are prevalent in modern robot systems, including a 2D range sensor, an inertial measurement unit (IMU), and wheel encoders. We fuse their measurements to reinforce the tracking process and ultimately obtain better 3D reconstructions. Specifically, we develop a 2D truncated signed distance field (TSDF) volume representation for the integration and ray-casting of laser frames, leading to a unified cost function in the pose estimation stage. To validate the estimated poses in the loop-closure optimization process, we train a classifier on features extracted from the heterogeneous sensors during registration. To evaluate our method in challenging use-case scenarios, we assembled a scanning platform prototype to acquire real-world scans. We further simulated synthetic scans based on high-fidelity synthetic scenes for quantitative evaluation. Extensive experimental evaluation on these two types of scans demonstrates that our system robustly acquires dense 3D reconstructions and outperforms state-of-the-art RGB-D and LiDAR systems.
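The abstract describes a 2D TSDF volume for integrating and ray-casting 2D laser frames, but gives no implementation detail. The following is a minimal sketch of what such a 2D TSDF update might look like, assuming a uniform square grid with a fixed cell resolution, a known truncation distance, and laser hits already transformed into world coordinates; all names and parameters are illustrative and are not the authors' implementation.

```python
import numpy as np

class TSDF2D:
    """Minimal 2D TSDF grid: illustrative sketch, not the paper's code."""

    def __init__(self, size=512, res=0.05, trunc=0.3):
        self.res = res                                         # cell size in meters
        self.trunc = trunc                                     # truncation distance in meters
        self.sdf = np.ones((size, size), dtype=np.float32)     # normalized signed distance, init to +1
        self.weight = np.zeros((size, size), dtype=np.float32) # per-cell fusion weight
        self.origin = np.array([size * res / 2.0] * 2)         # world origin at the grid center

    def integrate(self, sensor_xy, hits_xy):
        """Fuse one 2D laser scan given the sensor position and hit points (world frame).

        Each cell sampled along a beam, up to one truncation band past the hit,
        blends its truncated signed distance into the grid with a running
        weighted average, mirroring the per-voxel TSDF update used for depth images."""
        for hit in hits_xy:
            ray = hit - sensor_xy
            depth = np.linalg.norm(ray)
            if depth < 1e-6:
                continue
            direction = ray / depth
            for t in np.arange(0.0, depth + self.trunc, self.res):
                p = sensor_xy + t * direction
                i, j = np.floor((p + self.origin) / self.res).astype(int)
                if not (0 <= i < self.sdf.shape[0] and 0 <= j < self.sdf.shape[1]):
                    continue
                sdf_val = np.clip(depth - t, -self.trunc, self.trunc) / self.trunc
                w_old = self.weight[i, j]
                self.sdf[i, j] = (self.sdf[i, j] * w_old + sdf_val) / (w_old + 1.0)
                self.weight[i, j] = w_old + 1.0
```

Ray-casting a laser frame against such a grid would march along each beam until the stored signed distance changes sign, analogous to the depth-image case; per the abstract, these laser measurements are then folded into a unified cost function in the pose estimation stage.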
