Accurate, Scalable and Parallel Structure from Motion

In this paper, we tackle accurate Structure from Motion (SfM), in particular camera registration, for problems that far exceed the memory of a single compute node. Unlike previous methods, which drastically simplify the parameters of SfM, we preserve as many cameras, tracks and their corresponding connectivity as possible to obtain a highly consistent and accurate SfM. Using a camera clustering algorithm, we divide all the cameras and associated images into clusters and leverage this formulation to carry out the subsequent track generation, local incremental SfM and final bundle adjustment in a scalable and parallel scheme. Taking advantage of both incremental and global SfM methods, we apply the relative motions from local incremental SfM to the global motion averaging framework and obtain more accurate and robust global camera poses than state-of-the-art methods. We extensively demonstrate the superior performance of our method on benchmark, Internet and our own challenging city-scale datasets.
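
To make the clustering step concrete, the following is a minimal sketch of splitting a camera graph into size-bounded clusters while keeping track of the connections that cross cluster boundaries. It is not the clustering algorithm of the paper, which the abstract does not specify: the greedy region-growing rule, the function names `cluster_cameras` and `cut_edges`, the integer camera ids and the use of match counts as edge weights are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_cameras(edges, max_size):
    """Greedily split a camera graph into clusters of at most max_size cameras.

    edges maps (cam_a, cam_b) -> number of shared feature matches (assumed weight).
    Camera ids are assumed to be integers.
    """
    # Build a symmetric adjacency map from the edge list.
    adj = defaultdict(dict)
    for (a, b), w in edges.items():
        adj[a][b] = w
        adj[b][a] = w

    unassigned = set(adj)
    clusters = []
    while unassigned:
        # Start a new cluster from the smallest unassigned camera id.
        seed = min(unassigned)
        unassigned.remove(seed)
        cluster = [seed]

        # gain[c] = total match count between outside camera c and the cluster.
        gain = defaultdict(int)
        for n, w in adj[seed].items():
            if n in unassigned:
                gain[n] += w

        while len(cluster) < max_size and gain:
            # Add the camera most strongly connected to the current cluster;
            # ties are broken in favour of the smaller camera id.
            best = max(gain, key=lambda c: (gain[c], -c))
            del gain[best]
            unassigned.remove(best)
            cluster.append(best)
            for n, w in adj[best].items():
                if n in unassigned:
                    gain[n] += w

        clusters.append(sorted(cluster))
    return clusters


def cut_edges(edges, clusters):
    """Return the edges whose endpoints fall in different clusters."""
    label = {cam: i for i, cl in enumerate(clusters) for cam in cl}
    return {e: w for e, w in edges.items() if label[e[0]] != label[e[1]]}


edges = {(0, 1): 50, (1, 2): 40, (2, 3): 5, (3, 4): 60, (4, 5): 45}
clusters = cluster_cameras(edges, max_size=3)
print(clusters)                    # [[0, 1, 2], [3, 4, 5]]
print(cut_edges(edges, clusters))  # {(2, 3): 5}
```

In this toy graph the weak link between cameras 2 and 3 ends up as the only cross-cluster edge. Each cluster could then be processed independently, while the recorded cross-cluster edges retain the connectivity that a later global step would need.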
