Floorplan Priors for Joint Camera Pose and Room Layout Estimation

We present a novel approach for reconstructing large or featureless scenes. Our method jointly estimates camera poses and a room layout from a set of partial reconstructions, which arise when camera tracking is interrupted while scanning such a scene. Unlike existing methods that rely on feature point matching to localize the camera, we exploit the 3D "box" structure of a typical room layout that satisfies the Manhattan World assumption. We first estimate a local layout for each partial scan separately, and then combine these local layouts into a globally aligned layout with loop closure. We validate our method quantitatively and qualitatively on real and synthetic scenes of various sizes and complexities. The evaluations and comparisons show that our method is more effective and more accurate than the methods it is compared against.
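
To illustrate why the Manhattan World assumption helps when combining local layouts, the sketch below shows that two gravity-aligned scans whose walls follow a common pair of orthogonal directions can only differ by one of four yaw rotations, 90 degrees apart. This is not the method of the paper; the function names (`wrap`, `manhattan_candidates`, `pick_rotation`), the 2D yaw-only setting, and the use of a rough prior yaw to choose among the candidates are all assumptions made for illustration.

```python
import math


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def manhattan_candidates(theta_a, theta_b):
    """Return the four yaw rotations that map scan B's wall axes onto scan A's.

    theta_a and theta_b are the yaw angles (radians) of one dominant wall
    direction in each scan. Because wall axes repeat every 90 degrees, the
    relative rotation is only determined up to multiples of 90 degrees.
    """
    base = theta_a - theta_b
    return [wrap(base + k * math.pi / 2) for k in range(4)]


def pick_rotation(theta_a, theta_b, prior_yaw):
    """Choose the candidate closest to a rough prior yaw (hypothetical input)."""
    candidates = manhattan_candidates(theta_a, theta_b)
    return min(candidates, key=lambda c: abs(wrap(c - prior_yaw)))


if __name__ == "__main__":
    yaw = pick_rotation(math.radians(10), math.radians(-35), math.radians(130))
    print(round(math.degrees(yaw), 3))
```

In this simplified setting the rotation search collapses to four discrete hypotheses per pair of scans, so the remaining work is choosing among them and solving for translation.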
