Paper Information - Unsupervised Learning of Consensus Maximization for 3D Vision Problems

Unsupervised Learning of Consensus Maximization for 3D Vision Problems

Consensus maximization is a key strategy in 3D vision for robust geometric model estimation from measurements with outliers. Generic methods for consensus maximization, such as Random Sample Consensus (RANSAC), have played a tremendous role in the success of 3D vision, in spite of the ubiquity of outliers. However, replicating the same generic behaviour in a deeply learned architecture, using supervised approaches, has proven to be difficult. In that context, unsupervised methods have great potential to adapt to any unseen data distribution, and are therefore highly desirable. In this paper, we propose for the first time an unsupervised learning framework for consensus maximization, in the context of solving 3D vision problems. For that purpose, we establish a relationship between inlier measurements, represented by an ideal of the inlier set, and the subspace of polynomials representing the space of target transformations. Using this relationship, we derive a constraint that must be satisfied by the sought inlier set. This constraint can be tested without knowing the transformation parameters, and therefore allows us to efficiently define the geometric model fitting cost. This model fitting cost is used as a supervisory signal for learning consensus maximization, where the learning process seeks the largest measurement set that minimizes the proposed model fitting cost. Using our method, we solve a diverse set of 3D vision problems, including 3D-3D matching, non-rigid 3D shape matching with piecewise rigidity, and image-to-image matching. Despite being unsupervised, our method outperforms RANSAC in all three tasks for several datasets.
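The idea of a fitting cost that can be tested without knowing the transformation parameters can be illustrated with a small sketch. The example below is not the paper's formulation. It assumes the classic two-view epipolar constraint as the geometric model and uses the smallest singular value of a monomial embedding of the correspondences as the cost. The function names `epipolar_embedding` and `fitting_cost`, the synthetic camera setup, and the outlier distribution are all hypothetical choices made for illustration.

```python
import numpy as np


def epipolar_embedding(x1, x2):
    """Embed n correspondences as an (n, 9) matrix of bilinear monomials.

    Row i equals kron(x2h_i, x1h_i), so row_i @ F.ravel() == x2h_i^T F x1h_i
    for any 3x3 matrix F, where x1h/x2h are homogeneous image points.
    """
    x1h = np.hstack([x1, np.ones((len(x1), 1))])
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    return np.einsum("ni,nj->nij", x2h, x1h).reshape(len(x1), 9)


def fitting_cost(x1, x2):
    """Smallest singular value of the row-normalized embedding (needs n >= 9).

    If every correspondence obeys one common epipolar geometry, the embedding
    has a non-trivial null vector, so this value is numerically zero. The
    camera rotation and translation are never estimated or used here.
    """
    A = epipolar_embedding(x1, x2)
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    return np.linalg.svd(A, compute_uv=False)[-1]


# Synthetic two-view scene (assumed setup): 30 points in front of both cameras.
rng = np.random.default_rng(0)
n = 30
X = np.column_stack([
    rng.uniform(-2.0, 2.0, n),
    rng.uniform(-2.0, 2.0, n),
    rng.uniform(4.0, 6.0, n),
])

angle = 0.1  # rotation of the second camera about the y axis, in radians
c, s = np.cos(angle), np.sin(angle)
R = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t = np.array([0.5, 0.1, 0.0])

x1 = X[:, :2] / X[:, 2:]        # projection into the first camera
X2 = X @ R.T + t                # points in the second camera frame
x2 = X2[:, :2] / X2[:, 2:]      # projection into the second camera

clean_cost = fitting_cost(x1, x2)

# Replace the first 10 matches with random points to simulate outliers.
x2_mixed = x2.copy()
x2_mixed[:10] = rng.uniform(-0.5, 0.5, (10, 2))
mixed_cost = fitting_cost(x1, x2_mixed)

print("cost, all inliers:  ", clean_cost)
print("cost, with outliers:", mixed_cost)
```

In this sketch the cost is close to machine precision for the all-inlier set and clearly larger once outliers are mixed in, which is the property that lets such a cost act as a supervisory signal. Finding the largest subset with a low cost is the consensus maximization step itself and is not attempted here.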
