A comparison of statistical emulation methodologies for multi‐wave calibration of environmental models

Expensive computer codes, particularly those that simulate environmental or geological processes, such as climate models, require calibration (sometimes called tuning). When calibrating an expensive simulator using uncertainty quantification methods, it is usually necessary to replace the computer code with a statistical model, called an emulator, when running the calibration algorithm. Although emulators based on Gaussian processes are typically many orders of magnitude faster to evaluate than the simulators they mimic, many applications have sought to speed up the computations further by using regression‐only emulators instead, arguing that the extra sophistication brought by the Gaussian process is not worth the extra computational cost. This was the case for the analysis that produced the UK climate projections in 2009. In this paper, we compare the effectiveness of both emulation approaches within “history matching,” a multi‐wave calibration framework that is becoming popular in the climate modeling community. We find that Gaussian processes offer significant benefits over regression‐only approaches in reducing parametric uncertainty. We also find that, in a multi‐wave experiment, using regression‐only emulators initially, followed by Gaussian process emulators for the refocussing experiments, can be nearly as effective as using Gaussian processes throughout, at a fraction of the computational cost. We further identify a number of design‐dependent and emulator‐dependent features of the multi‐wave history matching approach that can cause apparent, yet premature, convergence of our estimates of parametric uncertainty. We compare these approaches to calibration in idealized examples and apply them to a well‐known geological reservoir model.
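To make the comparison concrete, the following is a minimal sketch of a single history matching wave with each type of emulator. It is an illustration under stated assumptions, not the paper's implementation: the one-dimensional `simulator`, the design, the fixed Gaussian process hyperparameters, the variances, and the cutoff of 3 on the implausibility are all hypothetical choices. The implausibility used here is the commonly used form I(x) = |z − E[f(x)]| / sqrt(Var_emulator(x) + Var_obs + Var_disc), and inputs with I(x) ≤ 3 are kept as "not ruled out yet" (NROY).

```python
import numpy as np

def simulator(x):
    """Hypothetical cheap stand-in for an expensive simulator."""
    return np.sin(3.0 * x) + x

def fit_regression(x, y, degree=1):
    """Regression-only emulator: polynomial mean, constant residual variance.
    Simplification: uncertainty in the coefficients is ignored."""
    X = np.vander(x, degree + 1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(x) - degree - 1)
    def predict(xs):
        return np.vander(xs, degree + 1) @ beta, np.full(xs.shape, sigma2)
    return predict

def fit_gp(x, y, length=0.3, sigma2=1.0, nugget=1e-8):
    """Zero-mean GP emulator, squared-exponential kernel.
    Hyperparameters are fixed by assumption rather than estimated."""
    def kernel(a, b):
        return sigma2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)
    L = np.linalg.cholesky(kernel(x, x) + nugget * np.eye(len(x)))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    def predict(xs):
        Ks = kernel(xs, x)
        v = np.linalg.solve(L, Ks.T)
        var = np.maximum(sigma2 - np.sum(v ** 2, axis=0), 0.0)
        return Ks @ alpha, var
    return predict

def implausibility(predict, xs, z, obs_var, disc_var):
    """Standardized distance between the observation and the emulator mean."""
    mean, em_var = predict(xs)
    return np.abs(z - mean) / np.sqrt(em_var + obs_var + disc_var)

# Wave-1 design and simulator runs (assumed: 8 equally spaced points on [0, 2]).
x_design = np.linspace(0.0, 2.0, 8)
y_design = simulator(x_design)

# Synthetic "observation" generated at a known input, for illustration only.
x_true = 0.5
z = simulator(x_true)
obs_var, disc_var = 0.1 ** 2, 0.0   # assumed observation and discrepancy variances

reg = fit_regression(x_design, y_design)
gp = fit_gp(x_design, y_design)

# Candidate inputs; keep those with implausibility at or below the cutoff.
xs = np.linspace(0.0, 2.0, 401)
cutoff = 3.0
frac_reg = np.mean(implausibility(reg, xs, z, obs_var, disc_var) <= cutoff)
frac_gp = np.mean(implausibility(gp, xs, z, obs_var, disc_var) <= cutoff)
I_gp_true = implausibility(gp, np.array([x_true]), z, obs_var, disc_var)[0]

print(f"NROY fraction, regression-only emulator: {frac_reg:.2f}")
print(f"NROY fraction, Gaussian process emulator: {frac_gp:.2f}")
```

In this toy setting the linear regression emulator carries a large residual variance, so it can rule out little of the input space, whereas the Gaussian process interpolates the runs and rules out more. A multi‐wave version would draw a new design inside the NROY region and refit, which is where a cheap regression first wave followed by Gaussian process waves could be tried.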
