The potential of an observational data set for calibration of a computationally expensive computer model

Abstract. We measure the potential of an observational data set to constrain a set of inputs to a complex and computationally expensive computer model. We use each member of an ensemble of output from a computationally expensive model, corresponding to an observable part of a modelled system, in turn as a proxy for an observational data set. We argue that, under some assumptions, our ability to constrain uncertain parameter inputs to a model using its own output as data provides an upper bound on our ability to constrain the model inputs using observations of the real system. The ensemble provides a set of known parameter input and model output pairs, which we use to build a computationally efficient statistical proxy for the full computer model, termed an emulator. We use the emulator to find and rule out "implausible" values for the inputs of held-out ensemble members, given the computer model output. Because we know the true values of the inputs for every ensemble member, we can compare our constraint of the model inputs with the true input for any held-out member. Measures of the quality of constraint can inform strategy for data collection campaigns, before any real-world data are collected, and also act as an effective sensitivity analysis. We use an ensemble of the ice sheet model Glimmer to demonstrate our measures of the quality of constraint. The ensemble comprises 250 model runs with 5 uncertain input parameters and an output variable representing the pattern of ice thickness over Greenland. We have an observation of historical ice sheet thickness that directly matches the output variable and offers an opportunity to constrain the model. We show that different ways of summarising the output variable (ice volume, ice surface area and maximum ice thickness) offer different potential constraints on individual input parameters, and that combining these summaries increases the power to constrain the model. We investigate the impact of uncertainty in the observations, or in model biases, on our measures, showing that even a modest uncertainty can seriously degrade the potential of the observational data to constrain the model.
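The ruling-out step described above is the implausibility test of history matching: candidate inputs whose emulated output lies too far from the "observation" (here, a held-out ensemble member's output), relative to the combined emulator, observational and model-discrepancy uncertainty, are discarded. The sketch below is a minimal illustration of this idea, not the paper's implementation: it fits a Gaussian process emulator to a synthetic 250-member, 5-parameter ensemble, treats one held-out run as the observation, and rules out inputs with implausibility above the conventional three-sigma cutoff. The variance terms `var_obs` and `var_disc` are assumed placeholder values for the observational and discrepancy uncertainties whose impact the paper investigates.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Hypothetical stand-in for the ensemble: 250 runs, 5 input parameters,
# and a scalar summary of the output field (e.g. total ice volume).
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(250, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.3]) + 0.1 * rng.standard_normal(250)

# Hold out one ensemble member and treat its output as the "observation";
# its input x_true is the known truth we hope survives the constraint.
x_true, z_obs = X[0], y[0]
X_train, y_train = X[1:], y[1:]

# Fit a Gaussian process emulator to the remaining runs.
gp = GaussianProcessRegressor(
    kernel=ConstantKernel() * RBF(length_scale=np.ones(5)),
    normalize_y=True,
)
gp.fit(X_train, y_train)

# Evaluate implausibility over a dense sample of candidate inputs:
# |z - E[f(x)]| / sqrt(Var_emulator + Var_obs + Var_discrepancy).
candidates = rng.uniform(0.0, 1.0, size=(10_000, 5))
mean, sd = gp.predict(candidates, return_std=True)
var_obs, var_disc = 0.01, 0.01  # assumed uncertainties (placeholders)
impl = np.abs(z_obs - mean) / np.sqrt(sd**2 + var_obs + var_disc)

# Rule out candidates with implausibility above 3 (three-sigma rule);
# what remains is the "not ruled out yet" (NROY) input space.
nroy = candidates[impl < 3.0]
print(f"NROY fraction: {len(nroy) / len(candidates):.2%}")
```

Repeating this for each held-out ensemble member, and recording both whether the true input survives and what fraction of input space is ruled out, yields constraint-quality measures of the kind the abstract describes, and can be run separately for each output summary (volume, area, maximum thickness) or for their combination.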
