Mobile phone and cloud — A dream team for 3D reconstruction

Recently, Structure-from-Motion (SfM) pipelines for the 3D reconstruction of scenes from images have been pushed from desktop computers onto mobile devices such as phones and tablets. However, mobile devices offer much more than just the necessary computational power. The combination of a handheld device with a camera, a display, and full connectivity opens up possibilities for on-line 3D reconstruction that would otherwise be difficult to implement. In this work, we propose combining a regular mobile phone as frontend with a centralized server plus an annex cloud as backend for collaborative, on-line 3D reconstruction. We illustrate a few of the many advantages of this combination. First, we automatically balance the computational load between the frontend and the backend depending on battery autonomy and available bandwidth. Second, we select the best algorithms given the available resources to obtain better 3D models. Finally, we allow for collaborative modeling in order to arrive at more complete and more detailed models, especially when the objects or scenes are large. This paper presents an implementation of such a joint mobile-cloud modeling approach and demonstrates its advantages on real-life reconstructions.
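As a rough illustration of the first point, balancing load by battery autonomy and available bandwidth, a decision rule might look like the sketch below. It is not the implementation presented in the paper: the names `DeviceState` and `choose_executor`, the two thresholds, and the policy of keeping work local when both resources are plentiful are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DeviceState:
    """Hypothetical snapshot of the phone's resources."""
    battery_fraction: float  # remaining battery, from 0.0 to 1.0
    bandwidth_mbps: float    # measured uplink bandwidth in Mbit/s


def choose_executor(state, min_battery=0.3, min_bandwidth_mbps=2.0):
    """Return "frontend" or "backend" for the next processing step.

    The thresholds are illustrative placeholders, not values from the paper.
    """
    # Without enough bandwidth, uploading image data is impractical,
    # so the work stays on the phone regardless of battery level.
    if state.bandwidth_mbps < min_bandwidth_mbps:
        return "frontend"
    # With a usable link, a low battery favors offloading to the server.
    if state.battery_fraction < min_battery:
        return "backend"
    # Assumed policy: when both resources are plentiful, stay local
    # to avoid upload latency.
    return "frontend"


if __name__ == "__main__":
    print(choose_executor(DeviceState(battery_fraction=0.1, bandwidth_mbps=10.0)))
```

A real system would presumably re-evaluate such a rule as battery and network conditions change during a capture session, and could decide per pipeline stage rather than globally.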
