Error bounds for deep ReLU networks using the Kolmogorov-Arnold superposition theorem

We prove a theorem on the approximation of multivariate functions by deep ReLU networks for which the curse of dimensionality is lessened. Our theorem is based on a constructive proof of the Kolmogorov-Arnold superposition theorem and on a subset of multivariate continuous functions whose outer superposition functions can be efficiently approximated by deep ReLU networks.
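For context, a standard statement of the Kolmogorov-Arnold superposition theorem (the constructive variant used in this work may fix a different, but equivalent, choice of inner and outer functions) is the following: every continuous function f on [0,1]^n admits the representation

\[
f(x_1, \ldots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\]

where the outer functions \(\Phi_q\) and the inner functions \(\phi_{q,p}\) are continuous univariate functions, and the inner functions can be chosen independently of f. The subset of functions considered here is, roughly, the one for which the outer functions \(\Phi_q\) are well behaved enough to be efficiently approximated by deep ReLU networks.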
