Scaling Up Bayesian Uncertainty Quantification for Inverse Problems using Deep Neural Networks

Due to the importance of uncertainty quantification (UQ), the Bayesian approach to inverse problems has recently gained popularity in applied mathematics, physics, and engineering. However, traditional Bayesian inference methods based on Markov Chain Monte Carlo (MCMC) tend to be computationally intensive and inefficient for such high-dimensional problems. To address this issue, several methods based on surrogate models have been proposed to speed up the inference process. In particular, the calibration-emulation-sampling (CES) scheme has proven successful in high-dimensional UQ problems. In this work, we propose a novel CES approach to Bayesian inference that uses deep neural network (DNN) models for the emulation phase. The resulting algorithm is not only computationally more efficient, but also less sensitive to the training set. Further, by using an autoencoder (AE) for dimension reduction, we are able to speed up our Bayesian inference method by up to three orders of magnitude. Overall, our method, henceforth called the Dimension-Reduced Emulative Autoencoder Monte Carlo (DREAM) algorithm, scales Bayesian UQ up to thousands of dimensions in physics-constrained inverse problems. Using two low-dimensional (linear and nonlinear) inverse problems, we illustrate the validity of this approach. We then apply our method to two high-dimensional numerical examples (elliptic and advection-diffusion) to demonstrate its computational advantage over existing algorithms.

Keywords— Bayesian Inverse Problems, Ensemble Kalman Methods, Emulation, Convolutional Neural Network, Dimension Reduction, Autoencoder
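To make the CES scheme concrete, the following is a minimal sketch on a toy linear inverse problem. It assumes only NumPy and scikit-learn; the basic ensemble Kalman inversion loop, the MLPRegressor emulator, and the preconditioned Crank-Nicolson (pCN) sampler are simple stand-ins for the paper's convolutional DNN emulator and function-space MCMC, and all names here (G_true, log_lik, etc.) are hypothetical, not from the paper.

```python
# Sketch of calibration-emulation-sampling (CES) on a toy linear problem.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
d, m = 20, 10                       # parameter and data dimensions
A = rng.standard_normal((m, d)) / np.sqrt(d)
u_true = rng.standard_normal(d)
sigma = 0.1                         # observation noise std
y = A @ u_true + sigma * rng.standard_normal(m)

def G_true(U):
    """Forward map applied row-wise to an ensemble U of shape (J, d)."""
    return U @ A.T

# --- Calibration: a few ensemble Kalman inversion (EKI) updates ---------
J = 200                             # ensemble size
U = rng.standard_normal((J, d))     # prior ensemble ~ N(0, I)
inputs, outputs = [], []
for _ in range(10):
    GU = G_true(U)
    inputs.append(U.copy()); outputs.append(GU.copy())
    du = U - U.mean(axis=0); dg = GU - GU.mean(axis=0)
    C_ug = du.T @ dg / J            # cross-covariance (d x m)
    C_gg = dg.T @ dg / J            # output covariance (m x m)
    K = C_ug @ np.linalg.inv(C_gg + sigma**2 * np.eye(m))
    Y = y + sigma * rng.standard_normal((J, m))   # perturbed observations
    U = U + (Y - GU) @ K.T          # Kalman-style update of each member

# --- Emulation: fit a neural network to the calibration pairs -----------
X = np.vstack(inputs); Z = np.vstack(outputs)
emulator = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000).fit(X, Z)

def log_lik(u):
    r = y - emulator.predict(u[None, :])[0]       # emulated forward solve
    return -0.5 * np.dot(r, r) / sigma**2

# --- Sampling: pCN MCMC using the cheap emulator, not the PDE solver ----
beta, n_samp = 0.2, 5000
u = U.mean(axis=0); ll = log_lik(u); samples = []
for _ in range(n_samp):
    v = np.sqrt(1 - beta**2) * u + beta * rng.standard_normal(d)
    ll_v = log_lik(v)
    if np.log(rng.uniform()) < ll_v - ll:         # pCN acceptance ratio
        u, ll = v, ll_v                           # uses only the likelihood
    samples.append(u.copy())
print("posterior mean error:", np.linalg.norm(np.mean(samples, 0) - u_true))
```

In the DREAM variant described above, one would additionally train an autoencoder on the calibration ensemble and run the pCN chain in its low-dimensional latent space, decoding each proposal before the emulated likelihood evaluation; in this sketch, any dimension reduction (e.g., PCA as a crude stand-in for the paper's AE) could play that role.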
