Towards Internet-scale multi-view stereo

This paper introduces an approach for enabling existing multi-view stereo methods to operate on extremely large unstructured photo collections. The main idea is to decompose the collection into a set of overlapping sets of photos that can be processed in parallel, and to merge the resulting reconstructions. This overlapping clustering problem is formulated as a constrained optimization and solved iteratively. The merging algorithm, designed to be parallel and out-of-core, incorporates robust filtering steps to eliminate low-quality reconstructions and enforce global visibility constraints. The approach has been tested on several large datasets downloaded from Flickr.com, including one with over ten thousand images, yielding a 3D reconstruction with nearly thirty million points.

[1]  Marc Levoy,et al.  A volumetric method for building complex models from range images , 1996, SIGGRAPH.

[2]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[3]  Marc Levoy,et al.  QSplat: a multiresolution point rendering system for large meshes , 2000, SIGGRAPH.

[4]  Francis Schmitt,et al.  Silhouette and stereo fusion for 3D object modeling , 2003, Fourth International Conference on 3-D Digital Imaging and Modeling, 2003. 3DIM 2003. Proceedings..

[5]  Joydeep Ghosh,et al.  Model-based overlapping clustering , 2005, KDD '05.

[6]  Roberto Cipolla,et al.  Multi-view stereo via volumetric graph-cuts , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[7]  Michael M. Kazhdan,et al.  Poisson surface reconstruction , 2006, SGP '06.

[8]  Richard Szeliski,et al.  A Comparison and Evaluation of Multi-View Stereo Reconstruction Algorithms , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[9]  Olivier D. Faugeras,et al.  Multi-View Stereo Reconstruction and Scene Flow Estimation with a Global Image-Based Matching Score , 2007, International Journal of Computer Vision.

[10]  Randal C. Burns,et al.  Multilevel streaming for out-of-core surface reconstruction , 2007, Symposium on Geometry Processing.

[11]  Inderjit S. Dhillon,et al.  Weighted Graph Cuts without Eigenvectors A Multilevel Approach , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Jan-Michael Frahm,et al.  Detailed Real-Time Urban 3D Reconstruction from Video , 2007, International Journal of Computer Vision.

[13]  Michael Goesele,et al.  Multi-View Stereo for Community Photo Collections , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[14]  Jan-Michael Frahm,et al.  Real-Time Plane-Sweeping Stereo with Multiple Sweeping Directions , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Horst Bischof,et al.  A Globally Optimal Algorithm for Robust TV-L1 Range Image Integration , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[16]  Jan-Michael Frahm,et al.  Variable baseline/resolution stereo , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[17]  Jean Ponce,et al.  Accurate Camera Calibration from Multi-View Stereo and Bundle Adjustment , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Tomás Pajdla,et al.  Scalable multi-view stereo , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[19]  Richard Szeliski,et al.  Building Rome in a day , 2009, ICCV.

[20]  Jana Kosecka,et al.  Piecewise planar city 3D modeling from street view panoramic sequences , 2009, CVPR.

[21]  Jean Ponce,et al.  Accurate, Dense, and Robust Multiview Stereopsis , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .