Efficient tree-structured SfM by RANSAC generalized Procrustes analysis

A tree-structured SfM method based on RANSAC generalized Procrustes analysis (RGPA) is proposed. RGPA reliably merges multiple structures at a time and removes outliers. Quick and robust bottom-up reconstruction is achieved with a shallow tree.

This paper proposes a tree-structured structure-from-motion (SfM) method that recovers 3D scene structure and estimates camera poses from unordered image sets. Starting from atomic structures spanning the scene, we build well-connected structure groups and propose RANSAC generalized Procrustes analysis (RGPA) to glue together the structures in each group. The grouping and aligning operations proceed hierarchically until the full scene is reconstructed. Our work is the first attempt to use GPA for modern 3D reconstruction tasks. RGPA can merge multiple structures at a time and automatically identify outliers. The resulting reconstruction tree is much more compact and balanced than those of previous hierarchical SfM methods and is very shallow. These advantages, together with the resulting removal of intermediate bundle adjustments, lead to significantly improved computational efficiency over state-of-the-art SfM methods. The cameras and 3D scene can be recovered robustly in the presence of moderate noise. We verify the efficacy of our method on a variety of datasets and demonstrate that it produces metric reconstructions efficiently and robustly.
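To make the idea of merging several structures at once while rejecting outliers more concrete, the following is a minimal sketch of a RANSAC-wrapped generalized Procrustes alignment. It is not the paper's implementation: the abstract does not give the algorithmic details, so the function names (`similarity_align`, `gpa`, `ransac_gpa`), the three-point minimal sample, the fixed iteration counts, and the inlier rule are all assumptions made for illustration. It also assumes that every structure observes the same N points in the same order, which a real SfM pipeline would not guarantee.

```python
import numpy as np

def similarity_align(src, dst):
    """Least-squares similarity (s, R, t) such that dst ~ s * src @ R.T + t."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(xd.T @ xs)
    D = np.eye(3)
    D[2, 2] = np.sign(np.linalg.det(U @ Vt))  # force a proper rotation
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / (xs ** 2).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

def apply(X, transform):
    s, R, t = transform
    return s * X @ R.T + t

def gpa(structures, idx, n_iter=10):
    """Align all structures to an evolving mean shape, using points idx only."""
    ref = structures[0][idx]
    size0 = np.linalg.norm(ref - ref.mean(axis=0))  # fixes the scale gauge
    for _ in range(n_iter):
        transforms = [similarity_align(X[idx], ref) for X in structures]
        ref = np.mean([apply(X[idx], T) for X, T in zip(structures, transforms)], axis=0)
        c = ref.mean(axis=0)
        ref = c + (ref - c) * size0 / np.linalg.norm(ref - c)  # avoid scale drift
    return transforms

def ransac_gpa(structures, thresh, n_trials=200, seed=0):
    """Return per-structure similarity transforms and a boolean inlier mask."""
    rng = np.random.default_rng(seed)
    n = structures[0].shape[0]
    best = np.zeros(n, dtype=bool)
    for _ in range(n_trials):
        idx = rng.choice(n, size=3, replace=False)  # assumed minimal sample
        transforms = gpa(structures, idx, n_iter=3)
        aligned = np.stack([apply(X, T) for X, T in zip(structures, transforms)])
        # A point is an inlier if all structures agree on it after alignment.
        resid = np.linalg.norm(aligned - aligned.mean(axis=0), axis=2).max(axis=0)
        inliers = resid < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 3:
        raise ValueError("too few inliers to refit")
    return gpa(structures, np.flatnonzero(best)), best
```

Each entry of `structures` is an `(N, 3)` array. The returned transforms map every structure into a common frame whose scale is tied to the first structure, and the mask marks the points on which all structures agree to within `thresh` in that frame.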
