Speed-up of posterior inference of highly-parameterized environmental models from a Kalman proposal distribution: DREAM(KZS)

Markov chain Monte Carlo (MCMC) simulation methods are widely used to generate samples from a target distribution. In posterior inference of highly-parameterized environmental models, MCMC methods may converge very slowly, even with state-of-the-art algorithms such as DREAM(ZS) (differential evolution adaptive Metropolis). At each iteration, DREAM(ZS) generates proposals with a mix of parallel direction jumps and snooker jumps, both of which rely only on the information about the model parameters stored in the thinned chain history. In this study, to speed up the convergence of DREAM(ZS), we introduce a Kalman proposal distribution that uses the information contained in the covariance structure of the model parameters, the measurements, and the model outputs. Compared with the parallel direction jump and the snooker jump, the Kalman jump can generate a more directional update of the model parameters. Because the Kalman jump does not maintain detailed balance, we restrict it to the "burn-in" period and afterwards use only the other two jumps, with diminishing adaptation. The modified algorithm is called DREAM(KZS) because it combines the three jumps, each selected with a pre-defined probability. Numerical experiments demonstrate that DREAM(KZS) converges to the same posterior distribution as DREAM(ZS) but with a much lower computational budget. Specifically, in problems with about 100 unknown model parameters, the computational cost can be reduced by a factor of up to 20.
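To make the idea of a covariance-based proposal concrete, the following is a minimal sketch and not the authors' implementation. It assumes the Kalman jump takes the form of a standard ensemble Kalman-style update with perturbed observations, applied to an archive of parameter samples and their model outputs. The function names (`kalman_jump`, `choose_jump`), the jump probabilities, and the linear toy model are all hypothetical and chosen only for illustration.

```python
import numpy as np


def kalman_jump(params, outputs, obs, obs_cov, rng):
    """Ensemble Kalman-style proposal (an assumed form of the Kalman jump).

    params  : (n_ens, n_par) archived parameter samples
    outputs : (n_ens, n_obs) model outputs for those samples
    obs     : (n_obs,) measurements
    obs_cov : (n_obs, n_obs) measurement error covariance
    """
    n_ens = params.shape[0]
    dp = params - params.mean(axis=0)
    dy = outputs - outputs.mean(axis=0)
    c_py = dp.T @ dy / (n_ens - 1)  # parameter-output cross-covariance
    c_yy = dy.T @ dy / (n_ens - 1)  # output covariance
    # Perturb the measurements once per ensemble member.
    noise = rng.multivariate_normal(np.zeros(obs.size), obs_cov, size=n_ens)
    innovation = obs + noise - outputs
    # Solve (C_yy + R) X = innovation^T instead of forming an explicit inverse.
    weighted = np.linalg.solve(c_yy + obs_cov, innovation.T)
    return params + (c_py @ weighted).T


def choose_jump(iteration, burn_in, rng, p_kalman=0.2, p_snooker=0.1):
    """Pick a jump type; the probabilities here are hypothetical values."""
    # The Kalman jump is only allowed during burn-in.
    if iteration < burn_in and rng.random() < p_kalman:
        return "kalman"
    return "snooker" if rng.random() < p_snooker else "parallel"


# Toy linear model y = G m, used only to exercise the sketch.
rng = np.random.default_rng(0)
n_ens, n_par, n_obs = 200, 5, 8
G = rng.normal(size=(n_obs, n_par))
m_true = rng.normal(size=n_par)
obs_cov = 0.01 * np.eye(n_obs)
obs = G @ m_true

prior = rng.normal(size=(n_ens, n_par))
proposed = kalman_jump(prior, prior @ G.T, obs, obs_cov, rng)

# Mean distance of the ensemble from the true parameters, before and after.
before = np.linalg.norm(prior - m_true, axis=1).mean()
after = np.linalg.norm(proposed - m_true, axis=1).mean()
```

In this toy setting the update pulls the ensemble toward parameter values that reproduce the measurements, which is the "directional" behaviour described above. In a full sampler, such proposals would be used only while `iteration < burn_in`, with the parallel direction and snooker jumps taking over afterwards.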
