Bayesian inference of a lake water quality model by emulating its posterior density

We use a Gaussian stochastic process emulator to interpolate the posterior probability density of a computationally demanding application of the biogeochemical-ecological lake model BELAMO, in order to accelerate statistical inference of the parameters of the deterministic model and of the error model. The deterministic model is a mechanistic description of the key processes influencing the mass balances of nutrients, dissolved oxygen, organic particles, phytoplankton, and zooplankton in the lake. It is complemented by a Gaussian stochastic process that describes the remaining model bias and by independent, normally distributed observation errors. A small subsample of the Markov chain representing the posterior of the model parameters is propagated through the full model to obtain model predictions and uncertainty estimates. We expect this approximation to be more accurate, at only slightly higher computational cost, than the approach of an earlier paper, in which we used a Normal approximation to the posterior probability density and linear error propagation to the results. The performance of the two techniques is compared for a didactical example as well as for the lake model. As expected, for the didactical example, the emulator led to posterior marginals of the model parameters that are closer to those calculated by Markov chain simulation with the full model than the marginals based on the Normal approximation. For the lake model, the new technique proved applicable without an excessive increase in computational requirements, but the choice of the design data set for emulator calibration was challenging. Because the posterior is a scalar function of the parameters, the suggested technique is an alternative to emulating the potentially more complex, structured output of the simulation model, and it allows the use of a less case-specific emulator. The price is that the full model is still needed for prediction, although this can be done with a smaller, approximately independent subsample of the Markov chain.
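The workflow described above can be illustrated with a minimal Python sketch. It is not the implementation used for BELAMO, and everything in it is an assumption made for illustration: a one-dimensional Gumbel log density (`log_post_full`) stands in for the expensive posterior, `full_model_prediction` stands in for a full-model run, the emulator interpolates the logarithm of the density with a squared-exponential kernel of fixed correlation length, and the Markov chain is restricted to the range covered by the design points.

```python
import numpy as np

def log_post_full(theta):
    """Hypothetical stand-in for the expensive posterior: a skewed (Gumbel) log density."""
    return -theta - np.exp(-theta)

def full_model_prediction(theta):
    """Hypothetical stand-in for a prediction computed with the full model."""
    return np.exp(-0.3 * theta)

def sq_exp_kernel(a, b, length, var):
    """Squared-exponential covariance between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

class LogPosteriorEmulator:
    """Gaussian process interpolator (constant mean) of a scalar log posterior."""

    def __init__(self, design, values, length=0.75, jitter=1e-10):
        self.design = np.asarray(design, dtype=float)
        values = np.asarray(values, dtype=float)
        self.mean = values.mean()
        self.length = length          # assumed fixed, not estimated from the design data
        self.var = values.var()
        K = sq_exp_kernel(self.design, self.design, length, self.var)
        K += jitter * self.var * np.eye(self.design.size)  # small nugget for stability
        self.alpha = np.linalg.solve(K, values - self.mean)

    def __call__(self, theta):
        k = sq_exp_kernel(np.atleast_1d(float(theta)), self.design,
                          self.length, self.var)
        return float(self.mean + (k @ self.alpha)[0])

def metropolis(log_density, start, n_steps, step, bounds, rng):
    """Random-walk Metropolis sampler; proposals outside `bounds` are rejected."""
    chain = np.empty(n_steps)
    current, current_lp = start, log_density(start)
    for i in range(n_steps):
        proposal = current + step * rng.standard_normal()
        if bounds[0] <= proposal <= bounds[1]:
            proposal_lp = log_density(proposal)
            if np.log(rng.uniform()) < proposal_lp - current_lp:
                current, current_lp = proposal, proposal_lp
        chain[i] = current
    return chain

rng = np.random.default_rng(1)
bounds = (-2.5, 6.0)
design = np.linspace(bounds[0], bounds[1], 18)  # design data set for the emulator
values = log_post_full(design)                  # the only expensive posterior evaluations
emulator = LogPosteriorEmulator(design, values)

# The Markov chain is run on the cheap emulator instead of the full posterior.
chain = metropolis(emulator, start=0.5, n_steps=20000, step=2.0,
                   bounds=bounds, rng=rng)

# Only a thinned, approximately independent subsample goes through the full model.
subsample = chain[2000::90]
predictions = full_model_prediction(subsample)
pred_mean = predictions.mean()
pred_interval = np.quantile(predictions, [0.05, 0.95])
```

The sketch also hints at the design problem mentioned above: outside the design range the emulator falls back to its constant mean, so the sampler is kept inside the range, and a poorly placed design would distort the emulated posterior.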
