Improved Generalization Bound of Permutation Invariant Deep Neural Networks

We theoretically prove that permutation invariance of deep neural networks substantially improves their generalization performance. Learning problems whose data are invariant to permutations arise frequently in applications, for example, point cloud analysis and graph neural networks. Numerous methodologies have been developed and achieve strong empirical performance; however, the mechanism behind this performance is still not well understood. In this paper, we derive a generalization bound for invariant deep neural networks with the ReLU activation to clarify this mechanism. Our bound shows that the main term of the generalization gap is improved by a factor of $\sqrt{n!}$, where $n$ is the number of permuted coordinates of the data. Moreover, we prove that the approximation power of invariant deep neural networks attains the optimal rate, even though the networks are restricted to be invariant. To obtain these results, we develop several new proof techniques, such as a correspondence with a fundamental domain and a scale-sensitive metric entropy.
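To make the claimed improvement concrete, one can read the bound schematically (this is our paraphrase, not the paper's exact statement): if the main term of the generalization gap for an unrestricted network class is of order $\sqrt{C/m}$, with $C$ a generic complexity measure and $m$ the sample size, then restricting to the permutation invariant class improves it to order $\sqrt{C/(n!\,m)}$, i.e., division by $\sqrt{n!}$.

The following NumPy sketch illustrates one standard way to build a permutation invariant ReLU network, namely sum pooling over per-coordinate features. It is an illustration under our own assumptions (the class name, dimensions, and initialization are hypothetical), not the construction analyzed in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class PermutationInvariantNet:
    """Minimal sum-pooling network: f(X) = rho(sum_i phi(x_i)).

    Summing the per-coordinate embeddings phi(x_i) makes the output
    invariant to any permutation of the n input coordinates.
    """

    def __init__(self, in_dim, hidden_dim, out_dim, seed=0):
        rng = np.random.default_rng(seed)
        # phi: a shared ReLU layer applied to each coordinate independently
        self.W_phi = 0.1 * rng.standard_normal((in_dim, hidden_dim))
        self.b_phi = np.zeros(hidden_dim)
        # rho: a ReLU layer applied after the invariant aggregation
        self.W_rho = 0.1 * rng.standard_normal((hidden_dim, out_dim))
        self.b_rho = np.zeros(out_dim)

    def __call__(self, X):
        # X has shape (n, in_dim): n permutable coordinates
        H = relu(X @ self.W_phi + self.b_phi)  # per-coordinate features
        pooled = H.sum(axis=0)                 # permutation invariant sum
        return relu(pooled @ self.W_rho + self.b_rho)

# Reordering the n coordinates (e.g., the points of a point cloud)
# leaves the output unchanged.
net = PermutationInvariantNet(in_dim=3, hidden_dim=16, out_dim=1)
X = np.random.default_rng(1).standard_normal((5, 3))
perm = np.random.default_rng(2).permutation(5)
assert np.allclose(net(X), net(X[perm]))
```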
