Interactive 3D modeling of indoor environments with a consumer depth camera

Detailed 3D visual models of indoor spaces, from walls and floors to objects and their configurations, can provide extensive knowledge about the environments as well as rich contextual information about the people living in them. Vision-based 3D modeling has seen only limited success in applications, as it faces many technical challenges that only a few experts understand, let alone solve. In this work we use Kinect-style consumer depth cameras to enable non-expert users to scan their personal spaces into 3D models. We build a prototype mobile system for 3D modeling that runs in real time on a laptop, assisting and interacting with the user on the fly. Color and depth are used jointly to achieve robust 3D registration. The system offers online feedback and hints, tolerates human errors and alignment failures, and helps the user obtain complete scene coverage. We show that our prototype system can scan large environments (50 meters across) while preserving fine details (centimeter accuracy). This capability for detailed 3D modeling enables many promising applications, such as accurate 3D localization, dimension measurement, and interactive visualization.
