Quaternion Equivariant Capsule Networks for 3D Point Clouds

We present a 3D capsule module for processing point clouds that is equivariant to 3D rotations and translations and invariant to permutations of the input points. The operator receives a sparse set of local reference frames, computed from an input point cloud, and establishes end-to-end transformation equivariance through a novel dynamic routing procedure on quaternions. Further, we theoretically connect dynamic routing between capsules to the well-known Weiszfeld algorithm, a scheme for solving iterative re-weighted least squares (IRLS) problems with provable convergence properties. We show that such group dynamic routing can be interpreted as robust IRLS rotation averaging on capsule votes, where information is routed based on the final inlier scores. Based on our operator, we build a capsule network that disentangles geometry from pose, paving the way for more informative descriptors and a structured latent space. Our architecture allows joint object classification and orientation estimation without explicit supervision of rotations. We validate our algorithm empirically on common benchmark datasets.
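To make the IRLS interpretation concrete, the following is a minimal NumPy sketch of Weiszfeld-style robust averaging over unit-quaternion votes. It is not the routing procedure of the paper: the function names, the eigenvector-based weighted mean, the sine-of-angle residual, and the parameters `num_iters` and `eps` are all assumptions chosen for illustration.

```python
import numpy as np


def weighted_quaternion_mean(quats, weights):
    """Weighted mean of unit quaternions (illustrative helper).

    Returns the dominant eigenvector of sum_i w_i * q_i q_i^T, which is
    insensitive to the sign ambiguity q ~ -q.
    """
    scatter = (quats * weights[:, None]).T @ quats  # 4x4 symmetric matrix
    _, eigvecs = np.linalg.eigh(scatter)            # eigenvalues ascending
    return eigvecs[:, -1]


def irls_quaternion_average(quats, num_iters=10, eps=1e-6):
    """Robust average of quaternion votes via re-weighted least squares.

    Assumption: the residual of a vote is sin(theta), where theta is the
    angle between the vote and the current mean on the 3-sphere. Each
    vote is re-weighted by 1 / residual, as in Weiszfeld iterations.
    """
    quats = quats / np.linalg.norm(quats, axis=1, keepdims=True)
    weights = np.full(len(quats), 1.0 / len(quats))
    for _ in range(num_iters):
        mean = weighted_quaternion_mean(quats, weights)
        dots = np.abs(quats @ mean)                        # sign-invariant
        residual = np.sqrt(np.clip(1.0 - dots**2, 0.0, None))
        weights = 1.0 / np.maximum(residual, eps)          # Weiszfeld weights
        weights /= weights.sum()
    # Final mean under the last weights, which act as inlier scores.
    return weighted_quaternion_mean(quats, weights), weights
```

In this sketch the normalized weights play the role of inlier scores: votes that agree with the consensus rotation receive large weights, while outlying votes are suppressed.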
