Guided Capturing of Multi-view Stereo Datasets

We present an application for mobile devices that allows any user, even one without a background in computer vision, to capture a complete set of images suitable for multi-view stereo reconstruction. Compared to related tasks such as panorama capture, this setting is much harder, because the camera needs to move without restriction in 3D space. Our system uses structure from motion to register the captured images and to generate a sparse reconstruction of the scene. The dataset is built incrementally: in each iteration, the next best view is computed with a novel view planning strategy that aims for good coverage of the scene. The user is then guided towards the new view, and the image is captured automatically at the right position. The next iteration starts once the reconstruction has been updated. The quality of the resulting dataset is on par with that of datasets captured by an expert user.
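
The incremental procedure described above can be illustrated with a minimal sketch. It is not the view planning strategy of the paper, which the abstract does not specify. It assumes a toy 2D scene, a simple field-of-view visibility test, and a greedy rule that picks the candidate view seeing the most sparse points still observed by fewer than `min_views` images. All names (`visible`, `next_best_view`, `capture_loop`) and parameters are hypothetical.

```python
import math


def visible(view, point, fov_deg=60.0):
    """Toy visibility test: the point lies inside the camera's field of view.

    `view` is (x, y, heading_in_radians); occlusion and range are ignored.
    """
    cx, cy, heading = view
    angle = math.atan2(point[1] - cy, point[0] - cx)
    # Smallest absolute difference between the two angles, in [0, pi].
    diff = abs((angle - heading + math.pi) % (2 * math.pi) - math.pi)
    return diff <= math.radians(fov_deg) / 2


def next_best_view(candidates, points, seen_count, min_views=2):
    """Greedy stand-in for view planning: maximize newly useful coverage."""
    def gain(view):
        return sum(
            1
            for i, p in enumerate(points)
            if seen_count[i] < min_views and visible(view, p)
        )

    best = max(candidates, key=gain)
    return best, gain(best)


def capture_loop(candidates, points, min_views=2, max_images=20):
    """Plan a view, 'capture' it, update coverage, and repeat."""
    seen_count = [0] * len(points)
    captured = []
    remaining = list(candidates)
    while remaining and len(captured) < max_images:
        view, gain = next_best_view(remaining, points, seen_count, min_views)
        if gain == 0:
            break  # no candidate improves coverage any further
        # A real system would guide the user to `view`, capture the image
        # automatically, and update the reconstruction; here the planned
        # view is simply accepted and the coverage counts are updated.
        captured.append(view)
        remaining.remove(view)
        for i, p in enumerate(points):
            if visible(view, p):
                seen_count[i] += 1
    return captured, seen_count
```

In this sketch the loop stops as soon as no candidate view adds coverage, which mirrors the idea of iterating until the dataset is complete; a real system would update the sparse point set after every capture instead of keeping it fixed.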
