HSfM: Hybrid Structure-from-Motion

Structure-from-Motion (SfM) methods can be broadly categorized as incremental or global according to how they estimate initial camera poses. While incremental systems have advanced in robustness and accuracy, efficiency remains their key challenge. To address this problem, global reconstruction systems estimate all camera poses simultaneously from the epipolar geometry graph, but they are usually sensitive to outliers. In this work, we propose a new hybrid SfM method that tackles efficiency, accuracy and robustness in a unified framework. More specifically, we first propose an adaptive community-based rotation averaging method to estimate camera rotations in a global manner. Then, based on these estimated camera rotations, camera centers are computed in an incremental way. Extensive experiments show that our hybrid method performs similarly to or better than many state-of-the-art global SfM approaches in terms of computational efficiency, while achieving reconstruction accuracy and robustness similar to those of two state-of-the-art incremental SfM approaches.
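To illustrate why knowing the rotations first simplifies the incremental stage, the following is a minimal sketch, not the formulation used in this work. It assumes calibrated (normalized) image coordinates and already triangulated 3D points; the helper names `skew` and `estimate_center` are hypothetical. With a known rotation R, each observation x satisfies x ~ R(X - C), so [x]_x R (X - C) = 0 is linear in the camera center C and can be solved by least squares.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def estimate_center(R, points_3d, observations):
    """Least-squares camera center C for a camera with known rotation R.

    Illustrative assumption: observations are homogeneous, calibrated image
    points x with x ~ R (X - C). Each one yields [x]_x R C = [x]_x R X.
    """
    A_rows, b_rows = [], []
    for X, x in zip(points_3d, observations):
        M = skew(x) @ R          # 3x3 block, rank 2
        A_rows.append(M)
        b_rows.append(M @ X)
    A = np.vstack(A_rows)
    b = np.concatenate(b_rows)
    C, *_ = np.linalg.lstsq(A, b, rcond=None)
    return C

# Synthetic check: a camera rotated about the z-axis, looking at points in front of it.
theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
C_true = np.array([1.0, -2.0, 0.5])

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(6, 3)) + np.array([0.0, 0.0, 5.0])

cam = (R_true @ (points - C_true).T).T   # points in the camera frame
obs = cam / cam[:, 2:3]                  # normalized homogeneous observations

C_est = estimate_center(R_true, points, obs)
```

Two correspondences already give four independent equations for the three unknowns in C. A practical system would wrap such a solver in a robust loop (for example RANSAC) to reject outlier matches; that step is omitted here.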
