Joint Layout Estimation and Global Multi-view Registration for Indoor Reconstruction

In this paper, we propose a novel method that jointly solves scene layout estimation and global registration for accurate indoor 3D reconstruction. Given a sequence of range data, we first build a set of scene fragments using KinectFusion and register them through pose graph optimization. We then alternate between layout estimation and layout-based global registration, iterating so that each step improves the other. We extract the scene layout through hierarchical agglomerative clustering and energy-based multi-model fitting, taking noisy measurements into account. Given the estimated scene layout, we register all the range data with a global iterative closest point algorithm in which the 3D points that belong to the layout, such as walls and the ceiling, are constrained to lie close to it. We verify the proposed method experimentally, both quantitatively and qualitatively, on publicly available synthetic and real-world datasets.
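The layout constraint in the registration step can be illustrated with a minimal sketch. This is not the authors' implementation: it handles a single fragment and a single plane rather than a global ICP over all fragments, it assumes correspondences are already given, and every name in it (fit_plane, project_to_plane, kabsch, layout_constrained_step, the weight lam) is hypothetical. Each point labeled as layout gets an extra target, its own projection onto the layout plane, and a weighted rigid fit balances those targets against the ordinary correspondences.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through points: unit normal n and offset d with n.x + d = 0."""
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]  # direction of smallest variance
    return normal, -normal.dot(centroid)

def project_to_plane(points, normal, d):
    """Orthogonal projection of each point onto the plane n.x + d = 0."""
    dist = points @ normal + d
    return points - np.outer(dist, normal)

def kabsch(src, dst, weights):
    """Weighted rigid transform (R, t) minimizing sum_i w_i |R src_i + t - dst_i|^2."""
    w = weights / weights.sum()
    mu_s = (w[:, None] * src).sum(axis=0)
    mu_d = (w[:, None] * dst).sum(axis=0)
    H = (src - mu_s).T @ (w[:, None] * (dst - mu_d))
    U, _, Vt = np.linalg.svd(H)
    # Guard against reflections so that R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_d - R @ mu_s

def layout_constrained_step(src, matched, is_layout, normal, d, lam=1.0):
    """One alignment step (assumed form): correspondence targets plus
    plane-projection targets, weighted by lam, for points labeled as layout."""
    layout_src = src[is_layout]
    sources = np.vstack([src, layout_src])
    targets = np.vstack([matched, project_to_plane(layout_src, normal, d)])
    weights = np.concatenate([np.ones(len(src)), lam * np.ones(len(layout_src))])
    return kabsch(sources, targets, weights)
```

In a full pipeline a step like this would be repeated, re-estimating correspondences and the layout between iterations; larger values of lam pull layout points more strongly toward the plane at the expense of the point-to-point correspondences.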
