Neural-net-induced Gaussian process regression for function approximation and PDE solution

Abstract Neural-net-induced Gaussian process (NNGP) regression inherits both the high expressivity of deep neural networks (deep NNs) and the uncertainty quantification property of Gaussian processes (GPs). We generalize the current NNGP to include a larger number of hyperparameters and subsequently train the model by maximum likelihood estimation. Unlike previous works on NNGP that targeted classification, here we apply the generalized NNGP to function approximation and to solving partial differential equations (PDEs). Specifically, we develop an analytical iteration formula to compute the covariance function of the GP induced by a deep NN with an error-function nonlinearity. We compare the performance of the generalized NNGP for function approximation and PDE solution with those of GPs and fully connected NNs. We observe that for smooth functions the generalized NNGP yields the same order of accuracy as the GP, while both NNGP and GP outperform the deep NN. For non-smooth functions, the generalized NNGP is superior to the GP and comparable or superior to the deep NN.
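To illustrate the kind of iteration the abstract refers to, the following is a minimal sketch (not the authors' code) of a layer-wise NNGP covariance recursion for an error-function nonlinearity. It assumes the standard NNGP recursion with the arcsine expectation formula of Williams (1996) for erf units; the hyperparameter names sigma_b, sigma_w, and n_layers are illustrative placeholders.

import numpy as np

def nngp_kernel_erf(X1, X2, sigma_b=0.1, sigma_w=1.0, n_layers=3):
    """Covariance matrix K(X1, X2) of the GP induced by a deep erf network (sketch)."""
    d_in = X1.shape[1]
    # Base case: covariance of the first affine layer.
    K12 = sigma_b**2 + sigma_w**2 * (X1 @ X2.T) / d_in
    K11 = sigma_b**2 + sigma_w**2 * np.sum(X1**2, axis=1) / d_in   # diagonal K(x, x)
    K22 = sigma_b**2 + sigma_w**2 * np.sum(X2**2, axis=1) / d_in   # diagonal K(x', x')

    for _ in range(n_layers):
        # Arcsine formula for E[erf(u) erf(v)] under a bivariate Gaussian (Williams, 1996).
        denom = np.sqrt(np.outer(1.0 + 2.0 * K11, 1.0 + 2.0 * K22))
        K12 = sigma_b**2 + sigma_w**2 * (2.0 / np.pi) * np.arcsin(2.0 * K12 / denom)
        # Propagate the diagonal (x == x') terms through the same recursion.
        K11 = sigma_b**2 + sigma_w**2 * (2.0 / np.pi) * np.arcsin(2.0 * K11 / (1.0 + 2.0 * K11))
        K22 = sigma_b**2 + sigma_w**2 * (2.0 / np.pi) * np.arcsin(2.0 * K22 / (1.0 + 2.0 * K22))
    return K12

Given such a kernel, the hyperparameters (here sigma_b and sigma_w) can be trained by maximizing the GP marginal likelihood, exactly as for any other GP covariance function.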
