Accelerating Monte Carlo Markov chains with proxy and error models

In groundwater modeling, Markov chain Monte Carlo (MCMC) simulations are often used to calibrate aquifer parameters and propagate the uncertainty to the quantity of interest (e.g., pollutant concentration). However, this approach requires a large number of flow simulations and incurs a high computational cost, which prevents a systematic evaluation of the uncertainty in the presence of complex physical processes. To avoid this computational bottleneck, we propose to use an approximate model (proxy) to predict the response of the exact model, together with an error model that corrects the proxy response. Here, we use a proxy that relies on a much simpler description of the physics than the "exact" model. The error model accounts for this simplification of the physical processes and is trained on a learning set of realizations for which both the proxy and exact responses are computed. First, the key features of the proxy and exact response curves are extracted using functional principal component analysis (FPCA); then, a regression model is built to characterize the relationship between them. The performance of the proposed approach is evaluated on the Imperial College Fault model. We show that the joint use of the proxy and the error model to infer the model parameters in a two-stage MCMC set-up allows longer chains at a comparable computational cost. Unnecessary evaluations of the exact response are avoided through a preliminary evaluation of each proposal on the basis of the corrected proxy response. The error model trained on the learning set is crucial to provide a sufficiently accurate prediction of the exact response and to guide the chains to the low-misfit regions. The proposed methodology can be extended to multiple-chain algorithms or other Bayesian inference methods. Moreover, FPCA is not limited to the application presented here and offers a general framework to build error models.
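The two-stage screening described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `two_stage_mh`, the one-dimensional parameter, the symmetric Gaussian random-walk proposal, and the two toy log-posteriors (standing in for the exact model and for the corrected proxy) are all assumptions made for illustration. The point it shows is that the expensive posterior is evaluated only for proposals that first pass a test based on the cheap approximation, while the second-stage acceptance ratio keeps the exact posterior as the target of the chain.

```python
import math
import random


def two_stage_mh(log_post_exact, log_post_approx, theta0, n_steps, step=0.5, seed=0):
    """Two-stage (delayed-acceptance) Metropolis sampler with a symmetric proposal.

    log_post_exact  : expensive log-posterior (hypothetical stand-in for the exact model)
    log_post_approx : cheap log-posterior (hypothetical stand-in for the corrected proxy)
    Returns the chain and the number of exact evaluations performed.
    """
    rng = random.Random(seed)
    theta = theta0
    lp_exact = log_post_exact(theta)
    lp_approx = log_post_approx(theta)
    chain = [theta]
    n_exact = 1  # one exact evaluation for the starting point

    for _ in range(n_steps):
        prop = theta + rng.gauss(0.0, step)
        lp_approx_prop = log_post_approx(prop)

        # Stage 1: screen the proposal with the cheap approximation only.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) >= lp_approx_prop - lp_approx:
            chain.append(theta)  # rejected without running the exact model
            continue

        # Stage 2: run the exact model and correct for the stage-1 screening.
        lp_exact_prop = log_post_exact(prop)
        n_exact += 1
        log_alpha2 = (lp_exact_prop - lp_exact) - (lp_approx_prop - lp_approx)
        if math.log(1.0 - rng.random()) < log_alpha2:
            theta, lp_exact, lp_approx = prop, lp_exact_prop, lp_approx_prop
        chain.append(theta)

    return chain, n_exact


# Toy stand-ins (assumptions): the "exact" posterior is N(1, 1); the
# approximation is deliberately biased and too wide, N(1.2, 1.5).
def log_post_exact(theta):
    return -0.5 * (theta - 1.0) ** 2


def log_post_approx(theta):
    return -0.5 * (theta - 1.2) ** 2 / 1.5


chain, n_exact = two_stage_mh(log_post_exact, log_post_approx,
                              theta0=0.0, n_steps=5000, step=1.0, seed=1)
print(f"chain length: {len(chain)}, exact evaluations: {n_exact}")
print(f"posterior mean estimate: {sum(chain) / len(chain):.3f}")
```

In this sketch the saving is the gap between the chain length and the number of exact evaluations. The closer the approximation is to the exact posterior, the more often a proposal that passes the first stage is also accepted in the second, which is the role the abstract assigns to the error model.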
