A Combined Approach Toward Consistent Reconstructions of Indoor Spaces Based on 6D RGB-D Odometry and KinectFusion

We propose a 6D RGB-D odometry approach that finds the relative camera pose between consecutive RGB-D frames by keypoint extraction and feature matching both on the RGB and depth image planes. Furthermore, we feed the estimated pose to the highly accurate KinectFusion algorithm, which uses a fast ICP (Iterative Closest Point) to fine-tune the frame-to-frame relative pose and fuse the depth data into a global implicit surface. We evaluate our method on a publicly available RGB-D SLAM benchmark dataset by Sturm et al. The experimental results show that our proposed reconstruction method solely based on visual odometry and KinectFusion outperforms the state-of-the-art RGB-D SLAM system accuracy. Moreover, our algorithm outputs a ready-to-use polygon mesh (highly suitable for creating 3D virtual worlds) without any postprocessing steps.

[1]  Zhengyou Zhang,et al.  Iterative point matching for registration of free-form curves and surfaces , 1994, International Journal of Computer Vision.

[2]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[3]  Andrew W. Fitzgibbon,et al.  KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[4]  Wolfram Burgard,et al.  A benchmark for the evaluation of RGB-D SLAM systems , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[5]  Marc Levoy,et al.  A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[6]  William E. Lorensen,et al.  Marching cubes: A high resolution 3D surface construction algorithm , 1987, SIGGRAPH.

[7]  John J. Leonard,et al.  Robust Tracking for Real-Time Dense RGB-D Mapping with Kintinuous , 2012 .

[8]  Christopher Hunt,et al.  Notes on the OpenSURF Library , 2009 .

[9]  Florian Schmidt,et al.  3D Registration for Verification of Humanoid Justin's Upper Body Kinematics , 2012, 2012 Ninth Conference on Computer and Robot Vision.

[10]  Paul J. Besl,et al.  A Method for Registration of 3-D Shapes , 1992, IEEE Trans. Pattern Anal. Mach. Intell..

[11]  S. Umeyama,et al.  Least-Squares Estimation of Transformation Parameters Between Two Point Patterns , 1991, IEEE Trans. Pattern Anal. Mach. Intell..

[12]  Athena Vakali,et al.  Contextual object category recognition for RGB-D scene labeling , 2014, Robotics Auton. Syst..

[13]  William Robson Schwartz,et al.  BRAND: A robust appearance and depth descriptor for RGB-D images , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[14]  Kurt Konolige,et al.  CenSurE: Center Surround Extremas for Realtime Feature Detection and Matching , 2008, ECCV.

[15]  Luc Van Gool,et al.  Speeded-Up Robust Features (SURF) , 2008, Comput. Vis. Image Underst..

[16]  Qingxiong Yang,et al.  Recursive Bilateral Filtering , 2012, ECCV.

[17]  Gérard G. Medioni,et al.  Object modeling by registration of multiple range images , 1991, Proceedings. 1991 IEEE International Conference on Robotics and Automation.

[18]  Wolfram Burgard,et al.  An evaluation of the RGB-D SLAM system , 2012, 2012 IEEE International Conference on Robotics and Automation.

[19]  Friedrich Fraundorfer,et al.  Visual Odometry Part I: The First 30 Years and Fundamentals , 2022 .