SDF-TAR: Parallel Tracking and Refinement in RGB-D Data using Volumetric Registration

This paper introduces SDF-TAR, a real-time SLAM system based on volumetric registration in RGB-D data. While the camera is tracked online on the GPU, the most recently estimated poses are jointly refined on the CPU. We perform registration by aligning the data in limited-extent volumes anchored at salient 3D locations. This strategy permits efficient tracking on the GPU. Furthermore, the small memory load of the partial volumes allows pose refinement to run concurrently on the CPU. This refinement is performed over batches of a fixed number of frames, and each batch is jointly optimized until the next batch becomes available. Drift is thus reduced during online operation, eliminating the need for any post-processing. Evaluating on two public benchmarks, we demonstrate improved rotational motion estimation and higher reconstruction precision than related methods.
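The scheduling idea described above, where tracking runs continuously while a second worker refines the latest fixed-size batch until the next one arrives, can be illustrated with a minimal Python sketch. It is not the authors' implementation: `track`, `refine_step`, `refiner`, `run`, and `BATCH_SIZE` are hypothetical names, the "poses" are plain floats, and the refinement step is a stand-in that merely pulls each pose toward the batch mean. Only the control flow is meant to reflect the abstract.

```python
import queue
import threading

BATCH_SIZE = 4  # assumption: the abstract only says "a fixed number of frames"


def track(frame):
    """Stand-in for per-frame tracking (done on the GPU in the paper)."""
    return float(frame)  # a 1-D placeholder "pose"


def refine_step(poses):
    """Stand-in for one joint refinement iteration over a batch.

    Moves every pose halfway toward the batch mean; this is NOT the
    paper's optimization, just a cheap iterative update.
    """
    mean = sum(poses) / len(poses)
    return [p + 0.5 * (mean - p) for p in poses]


def refiner(batches, results):
    """Refine each batch until the next queue item (batch or sentinel) arrives."""
    batch = batches.get()
    while batch is not None:
        iterations = 0
        while True:
            batch = refine_step(batch)
            iterations += 1
            if not batches.empty():  # next batch or end-of-stream is waiting
                break
        results.append((batch, iterations))
        batch = batches.get()


def run(frames, batch_size=BATCH_SIZE):
    """Track frames in the calling thread; refine batches in a worker thread."""
    batches = queue.Queue()
    results = []
    worker = threading.Thread(target=refiner, args=(batches, results))
    worker.start()
    current = []
    for frame in frames:
        current.append(track(frame))
        if len(current) == batch_size:
            batches.put(current)  # hand the finished batch to the refiner
            current = []
    # A trailing partial batch is dropped in this sketch.
    batches.put(None)  # sentinel: no more batches
    worker.join()
    return results


if __name__ == "__main__":
    for poses, iterations in run(range(8)):
        print(len(poses), "poses refined for", iterations, "iterations")
```

The number of refinement iterations per batch is not fixed here; it depends on how long the tracker takes to fill the next batch, which mirrors the "optimized until the next batch becomes available" behaviour described in the abstract.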
