Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold

Single image pose estimation is a fundamental problem in many vision and robotics tasks, and existing deep learning approaches suffer by not completely modeling and handling: i) uncertainty about the predictions, and ii) symmetric objects with multiple (sometimes infinite) correct poses. To this end, we introduce a method to estimate arbitrary, non-parametric distributions on SO(3). Our key idea is to represent the distributions implicitly, with a neural network that estimates the probability given the input image and a candidate pose. Grid sampling or gradient ascent can be used to find the most likely pose, but it is also possible to evaluate the probability at any pose, enabling reasoning about symmetries and uncertainty. This is the most general way of representing distributions on manifolds, and to showcase the rich expressive power, we introduce a dataset of challenging symmetric and nearly-symmetric objects. We require no supervision on pose uncertainty -- the model trains only with a single pose per example. Nonetheless, our implicit model is highly expressive to handle complex distributions over 3D poses, while still obtaining accurate pose estimation on standard non-ambiguous environments, achieving state-of-the-art performance on Pascal3D+ and ModelNet10-SO(3) benchmarks.

[1]  Nassir Navab,et al.  Deep Bingham Networks: Dealing with Uncertainty and Ambiguity in Pose Estimation , 2020, International Journal of Computer Vision.

[2]  Brian Okorn,et al.  Learning Orientation Distributions for Object Pose Estimation , 2020, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[3]  Noah Snavely,et al.  An Analysis of SVD for Deep Rotation Estimation , 2020, NeurIPS.

[4]  G. Bianchi,et al.  Probabilistic orientation estimation with matrix Fisher distributions , 2020, NeurIPS.

[5]  Igor Gilitschenski,et al.  Deep Orientation Uncertainty Learning based on a Bingham Loss , 2020, ICLR.

[6]  Pratul P. Srinivasan,et al.  NeRF , 2020, ECCV.

[7]  Gurtej Kanwar,et al.  Normalizing Flows on Tori and Spheres , 2020, ICML.

[8]  Giorgia Pitteri,et al.  On Object Symmetries and 6D Pose Estimation from Images , 2019, 2019 International Conference on 3D Vision (3DV).

[9]  Timothy Bretl,et al.  PoseRBPF: A Rao-Blackwellized Particle Filter for6D Object Pose Estimation , 2019, Robotics: Science and Systems.

[10]  Gordon Wetzstein,et al.  Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations , 2019, NeurIPS.

[11]  Cees Snoek,et al.  Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  UvA-DARE (Digital Academic Repository) Reparameterizing Distributions on Lie Groups Reparameterizing Distributions on Lie Groups , 2022 .

[13]  Richard A. Newcombe,et al.  DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Sebastian Nowozin,et al.  Occupancy Networks: Learning 3D Reconstruction in Function Space , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Kostas Daniilidis,et al.  Cross-Domain 3D Equivariant Image Embeddings , 2018, ICML.

[16]  Nassir Navab,et al.  Explaining the Ambiguity of Object Detection and 6D Pose From Visual Data , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[17]  Sanja Fidler,et al.  Pose Estimation for Objects with Rotational Symmetry , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[18]  Zoltan-Csaba Marton,et al.  Implicit 3D Orientation Learning for 6D Object Detection from RGB Images , 2018, ECCV.

[19]  Jonathan Tompson,et al.  Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning , 2018, NeurIPS.

[20]  Sebastian Nowozin,et al.  Deep Directional Statistics: Pose Estimation with Uncertainty Quantification , 2018, ECCV.

[21]  René Vidal,et al.  A Mixed Classification-Regression Framework for 3D Pose Estimation from 2D Images , 2018, BMVC.

[22]  Roberto Cipolla,et al.  Concrete Problems for Autonomous Vehicle Safety: Advantages of Bayesian Deep Learning , 2017, IJCAI.

[23]  Anne E Carpenter,et al.  Opportunities and obstacles for deep learning in biology and medicine , 2017, bioRxiv.

[24]  Manolis I. A. Lourakis,et al.  T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-Less Objects , 2017, 2017 IEEE Winter Conference on Applications of Computer Vision (WACV).

[25]  Siegfried Wahl,et al.  Leveraging uncertainty information from deep neural networks for disease detection , 2016, Scientific Reports.

[26]  Leonidas J. Guibas,et al.  ObjectNet3D: A Large Scale Database for 3D Object Recognition , 2016, ECCV.

[27]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  R. Cipolla,et al.  Modelling uncertainty in deep learning for camera relocalization , 2015, 2016 IEEE International Conference on Robotics and Automation (ICRA).

[29]  Zoubin Ghahramani,et al.  Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning , 2015, ICML.

[30]  Shakir Mohamed,et al.  Variational Inference with Normalizing Flows , 2015, ICML.

[31]  Leonidas J. Guibas,et al.  Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[32]  Jitendra Malik,et al.  Viewpoints and keypoints , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[34]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Silvio Savarese,et al.  Beyond PASCAL: A benchmark for 3D object detection in the wild , 2014, IEEE Winter Conference on Applications of Computer Vision.

[36]  S. R. Jammalamadaka,et al.  Directional Statistics, I , 2011 .

[37]  Steven M. LaValle,et al.  Generating Uniform Incremental Grids on SO(3) Using the Hopf Fibration , 2010, WAFR.

[38]  Ashutosh Saxena,et al.  Learning 3-D object orientation from images , 2009, 2009 IEEE International Conference on Robotics and Automation.

[39]  Taeyoung Lee,et al.  Global symplectic uncertainty propagation on SO(3) , 2008, 2008 47th IEEE Conference on Decision and Control.

[40]  Mark E. J. Newman,et al.  Power-Law Distributions in Empirical Data , 2007, SIAM Rev..

[41]  Simon Li,et al.  Uncertainties in real‐time flood forecasting with neural networks , 2007 .

[42]  Allen Y. Yang,et al.  On Symmetry and Multiple-View Geometry: Structure, Pose, and Calibration from a Single Image , 2004, International Journal of Computer Vision.

[43]  K. Gorski,et al.  HEALPix: A Framework for High-Resolution Discretization and Fast Analysis of Data Distributed on the Sphere , 2004, astro-ph/0409513.

[44]  David A. Forsyth,et al.  Extracting projective structure from single perspective views of 3D point sets , 1993, 1993 (4th) International Conference on Computer Vision.

[45]  T. Poggio,et al.  Recognition and Structure from one 2D Model View: Observations on Prototypes, Object Classes and Symmetries , 1992 .

[46]  Jinsung Yoon,et al.  GENERATIVE ADVERSARIAL NETS , 2018 .

[47]  A. Khosla,et al.  A Deep Representation for Volumetric Shapes , 2015 .