On the Universality of Invariant Networks

Constraining linear layers in neural networks to respect symmetry transformations from a group $G$ is a common design principle for invariant networks that has found many applications in machine learning. In this paper, we consider a fundamental question that has received little attention to date: Can these networks approximate any (continuous) invariant function? We tackle the rather general case of $G\leq S_n$, an arbitrary subgroup of the symmetric group acting on $\mathbb{R}^n$ by permuting coordinates. This setting includes several recent popular invariant networks. We present two main results: First, $G$-invariant networks are universal if high-order tensors are allowed. Second, there are groups $G$ for which high-order tensors are unavoidable for obtaining universality. $G$-invariant networks consisting of only first-order tensors are of special interest due to their practical value. We conclude the paper by proving a necessary condition for the universality of $G$-invariant networks that incorporate only first-order tensors.
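
As an illustration of the constrained-linear-layer construction described above, the following is a minimal NumPy sketch, under stated assumptions, of a first-order $G$-invariant network: each weight matrix is projected onto the space of $G$-equivariant linear maps by averaging over the group action (a Reynolds-operator projection), and a final sum over coordinates yields invariance. This is not the paper's exact architecture; the function names, the projection-by-averaging construction, and the cyclic-group example are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's construction) of a first-order
# G-invariant network for a small permutation group G <= S_n given explicitly
# as a list of permutations of {0, ..., n-1}.
import numpy as np

def perm_matrix(p):
    """Permutation matrix P with (P x)_i = x_{p(i)}."""
    n = len(p)
    P = np.zeros((n, n))
    P[np.arange(n), p] = 1.0
    return P

def project_equivariant(W, group):
    """Project an arbitrary weight matrix onto the space of G-equivariant
    linear maps by averaging over the group (Reynolds operator):
    W_eq = (1/|G|) sum_g P_g^T W P_g, which commutes with every P_g."""
    mats = [perm_matrix(g) for g in group]
    return sum(P.T @ W @ P for P in mats) / len(mats)

def g_invariant_net(x, group, hidden_weights):
    """Stack of G-equivariant linear layers with ReLU, followed by an
    invariant sum over coordinates (first-order tensors only)."""
    h = x
    for W in hidden_weights:
        W_eq = project_equivariant(W, group)
        h = np.maximum(W_eq @ h, 0.0)  # equivariant layer + pointwise ReLU
    return h.sum()                     # summing coordinates is G-invariant

# Example: G = cyclic group of shifts acting on n = 4 coordinates.
n = 4
group = [np.roll(np.arange(n), k) for k in range(n)]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((n, n)) for _ in range(2)]
x = rng.standard_normal(n)
g = group[1]
print(g_invariant_net(x, group, weights))
print(g_invariant_net(x[g], group, weights))  # same value: output is invariant to G
```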
