Error bounds for approximations with deep ReLU neural networks in $W^{s, p}$ norms

We analyze approximation rates of deep ReLU neural networks for Sobolev-regular functions with respect to weaker Sobolev norms. First, we construct, based on a calculus of ReLU networks, artificial neural networks with ReLU activation functions that achieve certain approximation rates. Second, we establish lower bounds for the approximation by ReLU neural networks for classes of Sobolev-regular functions. Our results extend recent advances in the approximation theory of ReLU networks to the regime that is most relevant for applications in the numerical analysis of partial differential equations.
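To illustrate why the choice of norm matters, the following minimal sketch uses the well-known sawtooth approximation of $x^2$ on $[0,1]$ by a deep ReLU network. It is an assumption made for illustration and not necessarily the construction used in this work, and the function names `hat`, `square_approx`, and `errors` are hypothetical. The approximant $f_m(x) = x - \sum_{s=1}^{m} g_s(x)/4^s$, where $g_s$ is the $s$-fold composition of a hat function, is the piecewise linear interpolant of $x^2$ on a grid of width $2^{-m}$. Its sup-norm error is $2^{-2m-2}$, whereas the error of its derivative is only of order $2^{-m}$, so the rate measured in a $W^{1,\infty}$-type norm is slower than the rate in $L^\infty$.

```python
import numpy as np


def relu(x):
    """ReLU activation, applied elementwise."""
    return np.maximum(x, 0.0)


def hat(x):
    """Hat function on [0, 1] written with three ReLU units:
    equals 2x on [0, 1/2] and 2(1 - x) on [1/2, 1]."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5) + 2.0 * relu(x - 1.0)


def square_approx(x, m):
    """Depth-m sawtooth approximation f_m(x) = x - sum_{s=1}^m g_s(x) / 4^s
    of x^2 on [0, 1], where g_s is the s-fold composition of `hat`."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    g = x
    for s in range(1, m + 1):
        g = hat(g)               # one more layer of composition
        out = out - g / 4.0**s   # subtract the rescaled sawtooth
    return out


def errors(m, n=2**12 + 1):
    """Return (sup-norm error, derivative error) of f_m against x^2.

    The derivative error is estimated from cell-wise slopes on a uniform
    grid, compared with the exact derivative 2x at the cell midpoints.
    The default grid contains all breakpoints of f_m for m <= 12.
    """
    x = np.linspace(0.0, 1.0, n)
    approx = square_approx(x, m)
    sup_err = np.max(np.abs(approx - x**2))
    slopes = np.diff(approx) / np.diff(x)
    midpoints = 0.5 * (x[:-1] + x[1:])
    deriv_err = np.max(np.abs(slopes - 2.0 * midpoints))
    return sup_err, deriv_err


if __name__ == "__main__":
    for m in range(1, 7):
        sup_err, deriv_err = errors(m)
        print(f"m={m}: sup error {sup_err:.3e}, derivative error {deriv_err:.3e}")
```

Running the script shows the sup-norm error shrinking by a factor of about four per added layer, while the derivative error shrinks by a factor of about two.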
