Error bounds for deep ReLU networks using the Kolmogorov-Arnold superposition theorem

We prove a theorem on the approximation of multivariate functions by deep ReLU networks for which the curse of dimensionality is lessened. Our theorem is based on a constructive proof of the Kolmogorov-Arnold superposition theorem and on a subset of multivariate continuous functions whose outer superposition functions can be efficiently approximated by deep ReLU networks.
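For context, a standard statement of the Kolmogorov-Arnold superposition theorem (the constructive variant used in this work may fix a different, but equivalent, choice of inner and outer functions) is the following: every continuous function f on [0,1]^n admits the representation

\[
f(x_1, \ldots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
\]

where the outer functions \(\Phi_q\) and the inner functions \(\phi_{q,p}\) are continuous univariate functions, and the inner functions can be chosen independently of f. The subset of functions considered here is, roughly, the one for which the outer functions \(\Phi_q\) are well behaved enough to be efficiently approximated by deep ReLU networks.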
