Cloud-based collaborative 3D reconstruction using smartphones

This article presents a pipeline that enables multiple users to collaboratively acquire images with monocular smartphones and derive a 3D point cloud using a remote reconstruction server. A set of key images is automatically selected from each smartphone's camera video feed as multiple users record different viewpoints of an object, either concurrently or at different times. The selected images are automatically processed and registered with an incremental Structure from Motion (SfM) algorithm to create a 3D model. Because the SfM approach is incremental, feedback on the current reconstruction progress can be generated on the fly. This feedback takes the form of a preview window showing the current 3D point cloud, so users can see whether parts of a surveyed scene need further coverage while they are still in situ. We evaluate our 3D reconstruction pipeline through experiments in uncontrolled and unconstrained real-world scenarios. The datasets are publicly available.
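The abstract does not say how key images are chosen from the video feed. As an illustration only, the sketch below assumes one plausible criterion: keep a frame if it is sharp enough (variance of a Laplacian response) and differs enough from the previously kept frame. The function names `laplacian_variance` and `select_key_frames` and both thresholds are hypothetical and are not taken from the article.

```python
import numpy as np


def laplacian_variance(gray):
    """Sharpness score: variance of a 4-neighbour Laplacian over the image interior."""
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def select_key_frames(frames, min_sharpness, min_change):
    """Return indices of frames that pass both (assumed) tests.

    frames        -- iterable of 2D grayscale arrays of equal shape
    min_sharpness -- hypothetical threshold on the Laplacian variance
    min_change    -- hypothetical threshold on the mean absolute pixel
                     difference to the last selected key frame
    """
    selected = []
    last = None
    for index, frame in enumerate(frames):
        # Reject blurred or textureless frames.
        if laplacian_variance(frame) < min_sharpness:
            continue
        current = frame.astype(np.float64)
        # Reject frames that add little new content over the last key frame.
        if last is not None and float(np.mean(np.abs(current - last))) < min_change:
            continue
        selected.append(index)
        last = current
    return selected
```

In a setup like the one described, each phone could run such a filter locally and upload only the selected frames to the reconstruction server. A real system would more likely measure novelty through feature matches or estimated camera motion than through raw pixel differences.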
