Variational Monte Carlo—bridging concepts of machine learning and high-dimensional partial differential equations

A statistical learning approach for parametric PDEs arising in Uncertainty Quantification is derived. The method minimizes an empirical risk over a selected model class and is shown to be applicable to a broad range of problems. A unified convergence analysis is derived that accounts for both the approximation error and the statistical error, thereby combining theoretical results from numerical analysis and statistics. Numerical experiments illustrate the performance of the method with hierarchical tensors as the model class.
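To make the idea of empirical risk minimization over a model class concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. The toy problem `solve_parametric` (a scalar equation standing in for a parametric PDE solve), the uniform parameter distribution, and all function names are hypothetical. A Legendre polynomial space is swapped in for the hierarchical tensor model class purely to keep the example short.

```python
import numpy as np
from numpy.polynomial import legendre


def solve_parametric(y):
    # Hypothetical stand-in for a parametric PDE solve:
    # the scalar equation a(y) * u = 1 with coefficient a(y) = 2 + y.
    return 1.0 / (2.0 + y)


def fit_erm(n_samples=200, degree=8, seed=0):
    # Draw parameter samples y_i ~ U(-1, 1) and evaluate the solution map.
    rng = np.random.default_rng(seed)
    y = rng.uniform(-1.0, 1.0, n_samples)
    u = solve_parametric(y)
    # Model class (assumption): polynomials of degree <= `degree`
    # expressed in the Legendre basis.
    V = legendre.legvander(y, degree)
    # Minimizing the empirical L2 risk (1/N) * sum_i |u(y_i) - v(y_i)|^2
    # over this linear model class is a linear least-squares problem.
    coeffs, *_ = np.linalg.lstsq(V, u, rcond=None)
    return coeffs


def empirical_risk(coeffs, y):
    # Mean squared residual of the surrogate on the sample points y.
    residual = legendre.legval(y, coeffs) - solve_parametric(y)
    return float(np.mean(residual ** 2))


coeffs = fit_erm()
# Fresh samples give an estimate of the risk outside the training set.
y_test = np.random.default_rng(1).uniform(-1.0, 1.0, 1000)
test_risk = empirical_risk(coeffs, y_test)
print(f"estimated risk on fresh samples: {test_risk:.3e}")
```

In this sketch the polynomial degree controls the approximation error and the number of samples controls the statistical error, which are the two contributions the convergence analysis accounts for.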
