Self-Calibration Supported Robust Projective Structure-from-Motion

Typical Structure-from-Motion (SfM) pipelines rely on finding correspondences across images, recovering the projective structure of the observed scene, and upgrading it to a metric frame using camera self-calibration constraints. These problems are mostly solved independently of one another. For instance, camera self-calibration generally assumes that correct matches and a good projective reconstruction have already been obtained. In this paper, we propose a unified SfM method in which the matching process is supported by self-calibration constraints. We use the idea that good matches should yield a valid calibration. To this end, we use the projection equations of the Dual Image of the Absolute Quadric (DIAC) within a multiview correspondence framework to obtain robust matching from a set of putative correspondences. The matching process classifies points as inliers or outliers; this classification is learned in an unsupervised manner by a deep neural network. Together with theoretical reasoning on why the self-calibration constraints are necessary, we show experimental results demonstrating that exploiting these constraints yields robust multiview matching and accurate camera calibration.
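
As a rough illustration of the DIAC projection equations mentioned in the abstract, the standard relation is that a camera P maps the dual absolute quadric Q*_inf to the dual image of the absolute conic, w* ~ P Q*_inf P^T = K K^T, in any projective frame. The NumPy sketch below checks this relation numerically. It is not the paper's method: the intrinsics K, the pose (R, t), the projective distortion H, and the function name diac_from_projection are all hypothetical values and names chosen for illustration.

```python
import numpy as np

def diac_from_projection(P, Q_inf):
    """Project the dual absolute quadric into one view: w* ~ P Q*_inf P^T."""
    w = P @ Q_inf @ P.T
    return w / w[2, 2]  # remove the arbitrary projective scale

# Hypothetical intrinsics and pose, used only for this illustration.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = 0.3
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([[0.1], [-0.2], [1.5]])
P_metric = K @ np.hstack([R, t])

# In a metric frame the dual absolute quadric is diag(1, 1, 1, 0).
Q_metric = np.diag([1.0, 1.0, 1.0, 0.0])

# Hypothetical invertible 4x4 map to a projective frame (diagonally dominant).
H = np.array([[1.0, 0.2, 0.0, 0.1],
              [0.0, 1.0, 0.3, 0.0],
              [0.1, 0.0, 1.0, 0.2],
              [0.05, -0.1, 0.2, 1.0]])

# Points transform as X' = H X, so cameras and the quadric transform as:
P_proj = P_metric @ np.linalg.inv(H)
Q_proj = H @ Q_metric @ H.T

w_star = diac_from_projection(P_proj, Q_proj)
print(np.allclose(w_star, K @ K.T))
```

If the projective cameras were estimated from wrong matches, the projected quantity would in general no longer factor as K K^T for plausible intrinsics, which is one way to read the abstract's statement that good matches should yield a valid calibration.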
