Eigendecomposition-free Training of Deep Networks with Zero Eigenvalue-based Losses

Many classical computer vision problems, such as essential matrix computation and pose estimation from 3D to 2D correspondences, can be solved by finding the eigenvector corresponding to the smallest, or zero, eigenvalue of a matrix representing a linear system. Incorporating this in deep learning frameworks would allow us to explicitly encode known notions of geometry, instead of having the network implicitly learn them from data. However, performing eigendecomposition within a network requires the ability to differentiate this operation. While this is theoretically possible, in practice it introduces numerical instability into the optimization process. In this paper, we introduce an eigendecomposition-free approach to training a deep network whose loss depends on the eigenvector corresponding to a zero eigenvalue of a matrix predicted by the network. We demonstrate on several tasks, including keypoint matching and 3D pose estimation, that our approach is much more robust than explicit differentiation of the eigendecomposition. It has better convergence properties and yields state-of-the-art results on both tasks.
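To make the setting concrete, the following is a minimal NumPy sketch of the classical formulation the abstract refers to: the solution of a homogeneous linear system A x = 0 is the eigenvector of A^T A with the smallest (here zero) eigenvalue. It also shows that the quadratic form x^T A^T A x vanishes at the true solution, a quantity that can be evaluated without any eigendecomposition. The function names, matrix sizes, and synthetic data are assumptions chosen for illustration; this is not the paper's actual loss, which the abstract does not spell out.

```python
import numpy as np

def smallest_eigvec(A):
    """Classical solution: eigenvector of A^T A with the smallest eigenvalue."""
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(A.T @ A)
    return eigvecs[:, 0]

def residual_loss(A, x_true):
    """Quadratic form x^T A^T A x; zero exactly when x lies in the null space of A.

    Evaluating it needs only matrix-vector products, no eigendecomposition.
    """
    r = A @ x_true
    return float(r @ r)

# Synthetic example (hypothetical sizes): a unit-norm ground-truth vector in R^9.
rng = np.random.default_rng(0)
x_true = rng.standard_normal(9)
x_true /= np.linalg.norm(x_true)

# Build A so that every row is orthogonal to x_true, i.e. A @ x_true = 0.
B = rng.standard_normal((20, 9))
A = B - np.outer(B @ x_true, x_true)

x_est = smallest_eigvec(A)          # recovered up to sign
loss_clean = residual_loss(A, x_true)
loss_noisy = residual_loss(A + 0.1 * rng.standard_normal(A.shape), x_true)
```

In this sketch, `x_est` matches `x_true` up to sign, `loss_clean` is numerically zero, and `loss_noisy` is strictly positive. A residual of this kind only says that the ground-truth vector lies in the null space of A. On its own it does not stop other directions from also having zero eigenvalue, so it should be read as an illustration of the problem setting rather than as a complete training objective.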
