Universal Approximations of Invariant Maps by Neural Networks

We describe generalizations of the universal approximation theorem for neural networks to maps invariant or equivariant with respect to linear representations of groups. Our goal is to establish network-like computational models that are both invariant/equivariant and provably complete, in the sense that they can approximate any continuous invariant/equivariant map. Our contribution is threefold. First, in the general case of compact groups we propose a construction of a complete invariant/equivariant network using an intermediate polynomial layer. We invoke classical theorems of Hilbert and Weyl to justify and simplify this construction; in particular, we describe an explicit complete ansatz for the approximation of permutation-invariant maps. Second, we consider groups of translations and prove several versions of the universal approximation theorem for convolutional networks in the limit of continuous signals on Euclidean spaces. Finally, we consider 2D signal transformations equivariant with respect to the group SE(2) of rigid Euclidean motions. For this case we introduce the "charge-conserving convnet", a convnet-like computational model based on the decomposition of the feature space into isotypic representations of SO(2). We prove this model to be a universal approximator for continuous SE(2)-equivariant signal transformations.
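To make the permutation-invariant ansatz concrete, the sketch below composes symmetric polynomial invariants (power sums, which by classical invariant theory generate the ring of permutation-invariant polynomials in the coordinates) with an ordinary sigmoidal network. The function names, layer sizes, and the use of power sums as the invariant layer are illustrative assumptions for this sketch, not the exact construction proved complete in the paper.

```python
# Minimal sketch (assumptions noted above): permutation-invariant polynomial
# features followed by a standard one-hidden-layer sigmoidal network.

import numpy as np

def power_sum_features(x, max_degree):
    """Map x = (x_1, ..., x_n) to the power sums p_k = sum_i x_i**k, k = 1..max_degree.
    Power sums generate the symmetric polynomials, so they are unchanged by any
    permutation of the coordinates of x."""
    return np.array([np.sum(x ** k) for k in range(1, max_degree + 1)])

def mlp(features, weights1, bias1, weights2, bias2):
    """One hidden sigmoidal layer applied on top of the invariant features."""
    hidden = 1.0 / (1.0 + np.exp(-(weights1 @ features + bias1)))
    return weights2 @ hidden + bias2

# Usage: the composite map is permutation-invariant by construction.
rng = np.random.default_rng(0)
n, max_degree, hidden_dim = 5, 5, 16
w1, b1 = rng.normal(size=(hidden_dim, max_degree)), rng.normal(size=hidden_dim)
w2, b2 = rng.normal(size=(1, hidden_dim)), rng.normal(size=1)

x = rng.normal(size=n)
out_original = mlp(power_sum_features(x, max_degree), w1, b1, w2, b2)
out_permuted = mlp(power_sum_features(rng.permutation(x), max_degree), w1, b1, w2, b2)
assert np.allclose(out_original, out_permuted)  # equal up to floating-point rounding
```

Note that invariance here holds for any choice of weights, since reordering the inputs leaves the power-sum features fixed; completeness of such an ansatz then rests on the approximation power of the outer network.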
