The phase diagram of approximation rates for deep neural networks

We explore the phase diagram of approximation rates for deep neural networks. The phase diagram describes theoretically optimal accuracy-complexity relations and their qualitative properties. Our contribution is threefold. First, we generalize the existing result on the existence of the deep discontinuous phase in ReLU networks to function classes of arbitrary positive smoothness, and identify the boundary between feasible and infeasible rates. Second, we demonstrate that standard fully-connected architectures with a fixed width, independent of the smoothness, can adapt to the smoothness and achieve almost optimal rates. Finally, we discuss how the phase diagram can change for non-ReLU activation functions. In particular, we prove that using both sine and ReLU activations leads to very fast, nearly exponential approximation rates, thanks to the emerging capability of the network to implement efficient lookup operations.
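
To make the two architectural ingredients concrete, here is a minimal PyTorch sketch (an assumed dependency) of a fully-connected network whose width stays fixed while depth, and hence the weight count, serves as the complexity budget, optionally alternating sine and ReLU activations. The class name `FixedWidthNet`, the particular width/depth values, and the alternating layer layout are illustrative choices of ours, not constructions from the paper, whose results rely on explicit weight assignments rather than trained models.

```python
import torch
import torch.nn as nn

class FixedWidthNet(nn.Module):
    """Fully-connected network of constant width; depth is the budget knob.

    Illustrative sketch only: the paper proves approximation rates via
    explicit weight constructions, not training. `width`, `depth`, and
    the mixed sine/ReLU option are hypothetical parameters.
    """

    def __init__(self, in_dim=1, width=8, depth=20, use_sine=False):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(in_dim, width)]
            + [nn.Linear(width, width) for _ in range(depth - 1)]
        )
        self.out = nn.Linear(width, 1)
        self.use_sine = use_sine

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            z = layer(x)
            # Alternate sine and ReLU when enabled; periodic activations
            # are what enable the lookup-style operations mentioned above.
            if self.use_sine and i % 2 == 1:
                x = torch.sin(z)
            else:
                x = torch.relu(z)
        return self.out(x)

# Usage: the width is held fixed while depth (and so the number of
# weights W) grows, mirroring the accuracy-complexity trade-off.
net = FixedWidthNet(in_dim=2, width=8, depth=40, use_sine=True)
y = net(torch.rand(16, 2))
print(y.shape)  # torch.Size([16, 1])
```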
