Multistage SFM: A Coarse-to-Fine Approach for 3D Reconstruction

Several methods have been proposed for large-scale 3D reconstruction from large, unorganized image collections. A large reconstruction problem is typically divided into multiple components, which are reconstructed independently using structure from motion (SFM) and later merged. Incremental SFM methods are the most popular choice for basic structure recovery of a single component. They are robust and effective but strictly sequential in nature. We present a multistage approach for SFM reconstruction of a single component that breaks the sequential nature of incremental SFM methods. Our approach begins by quickly building a coarse 3D model using only a fraction of the features from the given images. In subsequent stages, the coarse model is enriched by localizing the remaining images and by matching and triangulating the remaining features. These stages are made efficient and highly parallel by leveraging the geometry of the coarse model. Our method produces models of similar quality to those of incremental SFM methods while being notably fast and parallel.
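
The first stage, which builds a coarse model from only a fraction of each image's features, can be illustrated with a minimal sketch. The abstract does not say how that fraction is chosen, so ranking by detector scale here is an assumption made purely for illustration. The `Feature` type, the `split_features` helper, and the default `fraction` value are likewise hypothetical names and not part of the authors' implementation.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Feature:
    """A hypothetical keypoint record: image position plus detector scale."""
    x: float
    y: float
    scale: float


def split_features(
    features: List[Feature], fraction: float = 0.2
) -> Tuple[List[Feature], List[Feature]]:
    """Split one image's features into a coarse subset and the remainder.

    Assumption: features are ranked by scale, largest first. The coarse
    subset would feed the initial reconstruction; the remainder would be
    matched and triangulated in later stages.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(features, key=lambda f: f.scale, reverse=True)
    # Keep at least one feature whenever the image has any.
    k = max(1, math.ceil(fraction * len(ranked))) if ranked else 0
    return ranked[:k], ranked[k:]


if __name__ == "__main__":
    feats = [Feature(10, 20, 1.5), Feature(30, 40, 6.0),
             Feature(50, 60, 3.2), Feature(70, 80, 0.8)]
    coarse, remaining = split_features(feats, fraction=0.5)
    print(len(coarse), len(remaining))
```

Because the split is computed independently per image, this step is trivially parallel across the collection, which is in keeping with the parallelism the abstract emphasizes for the later stages.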
