Bayesian Calibration of Expensive Multivariate Computer Experiments

This chapter is concerned with how to calibrate a computer model to observational data when the model produces multivariate output and is expensive to run. Long run times matter because such a model can be run at only a limited number of input settings, which rules out a brute-force Monte Carlo approach. Consequently, all inference must be based on a limited ensemble of model runs. In this chapter we use this ensemble to train a meta-model of the computer simulator, which we refer to as an emulator (Sacks et al. 1989). The emulator provides a probabilistic description of our beliefs about the computer model and can be used as a cheap surrogate for the simulator in the calibration process. For any input configuration not in the original ensemble of model runs, the emulator provides a probability distribution describing our uncertainty about the model’s output. The Bayesian approach to calibration of computer experiments using emulators was described by Kennedy and O’Hagan (2001). Their approach was developed for univariate computer models, and in this chapter we show how those methods can be extended to deal with multivariate models. We use principal component analysis to project the multivariate model output onto a lower-dimensional space, and then use Gaussian processes to emulate the map from the input space to that lower-dimensional space. Emulator predictions in the subspace can then be mapped back to the original data space. This gives a
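The emulation steps described in the abstract (project the ensemble output with principal component analysis, emulate the scores with Gaussian processes, then reconstruct in the original space) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the `toy_simulator` function, the grid design, the squared-exponential kernel with a fixed length scale, the zero-mean prior on the scores, and the plug-in variance scaling. It is not the authors' implementation, and it covers only the emulator, not the calibration step.

```python
import numpy as np

def toy_simulator(theta):
    """Hypothetical stand-in for an expensive simulator: (n, 2) inputs -> (n, 50) outputs."""
    t = np.linspace(0.0, 1.0, 50)
    return theta[:, [0]] * np.sin(2 * np.pi * t) + theta[:, [1]] ** 2 * t

def sq_exp_kernel(a, b, length_scale=0.3):
    """Squared-exponential correlation between rows of a and b (fixed length scale assumed)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def fit_pca_gp(X, Y, n_components=2, nugget=1e-8):
    """Project ensemble outputs Y onto leading principal components and fit a GP to the scores."""
    mean = Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    basis = Vt[:n_components]                  # (q, p) orthonormal principal directions
    scores = (Y - mean) @ basis.T              # (n, q) low-dimensional representation
    K = sq_exp_kernel(X, X) + nugget * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, scores))
    return {"X": X, "mean": mean, "basis": basis, "L": L, "alpha": alpha,
            "score_scale": scores.var(axis=0)}  # plug-in prior variance per component

def predict(em, X_new):
    """Posterior mean in the original output space and posterior variance of each score."""
    Ks = sq_exp_kernel(X_new, em["X"])
    score_mean = Ks @ em["alpha"]              # (m, q)
    v = np.linalg.solve(em["L"], Ks.T)
    rel_var = np.maximum(1.0 - (v**2).sum(axis=0), 0.0)   # clip tiny negative round-off
    score_var = rel_var[:, None] * em["score_scale"][None, :]
    y_mean = em["mean"] + score_mean @ em["basis"]        # reconstruct to the data space
    return y_mean, score_var

# Small ensemble of "expensive" runs on a 6 x 6 grid design (a toy choice).
g = np.linspace(0.0, 1.0, 6)
X = np.array([[a, b] for a in g for b in g])
Y = toy_simulator(X)

em = fit_pca_gp(X, Y)
X_new = np.array([[0.33, 0.71], [0.5, 0.5]])   # inputs not in the ensemble
y_mean, score_var = predict(em, X_new)
```

In this sketch the emulator's uncertainty lives in the space of principal component scores; a fuller treatment would also propagate that variance, and the variance discarded by truncating the basis, back to the output space.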

[1] A. O'Hagan et al. Bayesian calibration of computer models, 2001.

[2] Anthony O'Hagan et al. Diagnostics for Gaussian Process Emulators, 2009, Technometrics.

[3] A. O'Hagan et al. Bayesian emulation of complex multi-output and dynamic computer models, 2010.

[4] Carl E. Rasmussen et al. Gaussian processes for machine learning, 2005, Adaptive computation and machine learning.

[5] Michael Goldstein et al. Reified Bayesian modelling and inference for physical systems, 2009.

[6] Katrin J. Meissner et al. The role of land surface dynamics in glacial inception: a study with the UVic Earth System Model, 2003.

[7] G. Casella. An Introduction to Empirical Bayes Data Analysis, 1985.

[8] T. J. Mitchell et al. Exploratory designs for computational experiments, 1995.

[9] J. Rougier. Efficient Emulators for Multivariate Deterministic Functions, 2008.

[10] Thomas J. Santner et al. Design and analysis of computer experiments, 1998.

[11] Jerome Sacks et al. Designs for Computer Experiments, 1989.

[12] D. Nychka et al. Covariance Tapering for Interpolation of Large Spatial Datasets, 2006.

[13] A. O'Hagan et al. Statistical Methods for Eliciting Probability Distributions, 2005.

[14] Thomas J. Santner et al. The Design and Analysis of Computer Experiments, 2003, Springer Series in Statistics.

[15] Russel E. Caflisch et al. Quasi-Random Sequences and Their Discrepancies, 1994, SIAM J. Sci. Comput.

[16] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[17] D. Higdon et al. Computer Model Calibration Using High-Dimensional Output, 2008.

[18] J. Murphy et al. A methodology for probabilistic predictions of regional climate change from perturbed physics ensembles, 2007, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.

[19] I. Jolliffe. Principal Component Analysis, 2002.

[20] G. Box. Science and Statistics, 1976.

[21] C. D. Keeling et al. Atmospheric CO2 records from sites in the SIO air sampling network, 1994.