On the approximation of the solution of partial differential equations by artificial neural networks trained by a multilevel Levenberg-Marquardt method

This paper studies the approximation of solutions of partial differential equations by feedforward artificial neural networks. The learning problem is formulated as a least squares problem whose loss function is the residual of the partial differential equation, and a multilevel Levenberg-Marquardt method is used for training. This setting offers further insight into the potential of multilevel methods. When the least squares problem arises from the training of artificial neural networks, the variables subject to optimization are not related by any geometrical constraints, so the standard interpolation and restriction operators can no longer be employed. A heuristic inspired by algebraic multigrid methods is therefore proposed to construct the multilevel transfer operators. Numerical experiments show encouraging results on the efficiency of the new multilevel optimization method for training artificial neural networks, compared with the corresponding standard one-level procedure.
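To make the least squares formulation concrete, the following is a minimal sketch of the one-level baseline only, not the multilevel method of the paper. Everything specific in it is an assumption chosen for illustration: the 1D Poisson model problem -u'' = f on (0, 1) with zero boundary values, the single hidden layer of tanh units, the boundary penalty weight `bc_weight`, the finite-difference Jacobian, and the simple rule for updating the regularization parameter `mu`.

```python
import numpy as np

def unpack(p, m):
    # Parameter vector layout (illustrative): input weights, biases, output weights.
    return p[:m], p[m:2 * m], p[2 * m:]

def residual(p, x, m, f, bc_weight=10.0):
    """PDE residual of -u'' = f at collocation points x, plus boundary terms."""
    w, b, v = unpack(p, m)
    t = np.tanh(np.outer(x, w) + b)                  # hidden activations, shape (n, m)
    # Second derivative of tanh(w*x + b) w.r.t. x is -2*t*(1 - t^2)*w^2.
    u_xx = (-2.0 * t * (1.0 - t**2) * w**2) @ v
    pde = -u_xx - f(x)
    u0 = np.tanh(b) @ v                              # network value at x = 0
    u1 = np.tanh(w + b) @ v                          # network value at x = 1
    return np.concatenate([pde, bc_weight * np.array([u0, u1])])

def jacobian_fd(fun, p, eps=1e-6):
    # Forward-difference Jacobian; an analytic or autodiff Jacobian would be
    # preferable in practice, this keeps the sketch dependency-free.
    r0 = fun(p)
    J = np.empty((r0.size, p.size))
    for j in range(p.size):
        q = p.copy()
        q[j] += eps
        J[:, j] = (fun(q) - r0) / eps
    return J

def levenberg_marquardt(fun, p, mu=1e-2, iters=50):
    """Basic one-level Levenberg-Marquardt loop on 0.5 * ||fun(p)||^2."""
    r = fun(p)
    loss = 0.5 * (r @ r)
    history = [loss]
    for _ in range(iters):
        J = jacobian_fd(fun, p)
        # Regularized Gauss-Newton step: (J^T J + mu I) s = -J^T r.
        s = np.linalg.solve(J.T @ J + mu * np.eye(p.size), -(J.T @ r))
        r_new = fun(p + s)
        loss_new = 0.5 * (r_new @ r_new)
        if loss_new < loss:                          # accept step, relax regularization
            p, r, loss = p + s, r_new, loss_new
            mu = max(mu / 10.0, 1e-12)
        else:                                        # reject step, regularize more
            mu *= 10.0
        history.append(loss)
    return p, history

# Model problem with known solution u(x) = sin(pi * x).
m = 5                                                # hidden units (assumed)
x = np.linspace(0.0, 1.0, 21)                        # collocation points (assumed)
f = lambda z: np.pi**2 * np.sin(np.pi * z)
rng = np.random.default_rng(0)
p0 = rng.normal(size=3 * m)
p_opt, history = levenberg_marquardt(lambda p: residual(p, x, m, f), p0)
print(f"initial loss {history[0]:.3e}, final loss {history[-1]:.3e}")
```

Because steps are accepted only when they reduce the loss, `history` is non-increasing. A multilevel variant would additionally need transfer operators between parameter spaces of different sizes, which is the part the abstract addresses with its algebraic-multigrid-inspired heuristic and which this sketch does not attempt.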
