An efficient Bayesian data-worth analysis using a multilevel Monte Carlo method

Abstract: Improving the understanding of subsurface systems, and thus reducing prediction uncertainty, requires data collection. Because collecting subsurface data is costly, the data collection scheme must be cost-effective. Designing such a scheme, i.e., data-worth analysis, requires quantifying the uncertainty in model parameters, in predictions, and in both current and potential data. Assessing these uncertainties in large-scale stochastic subsurface hydrological simulations using standard Monte Carlo (MC) sampling or surrogate modeling is extremely computationally intensive and sometimes infeasible. In this work, we propose an efficient Bayesian data-worth analysis using a multilevel Monte Carlo (MLMC) method. Standard MC requires a large number of high-fidelity model executions to estimate expectations to a prescribed accuracy, whereas MLMC can substantially reduce the computational cost by using multifidelity approximations. Since Bayesian data-worth analysis involves estimating many expectations, the cost savings of MLMC in the assessment can be substantial. While the proposed MLMC-based data-worth analysis is broadly applicable, we apply it to a highly heterogeneous two-phase subsurface flow simulation to select the candidate data set that gives the largest uncertainty reduction in predicting mass flow rates at four production wells. The choices made by the MLMC estimation are validated by the actual measurements of the potential data and are consistent with the standard MC estimation, while MLMC greatly reduces the computational cost compared with standard MC.
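To illustrate how multifidelity approximations reduce the cost of estimating an expectation, the following is a minimal sketch of an MLMC estimator on a toy problem, not the subsurface flow model or the estimator configuration used in the paper. Everything in it is an assumption made for illustration: the hypothetical `model` function (a midpoint-rule integral whose resolution doubles with each level), the uniform distribution of the uncertain parameter, the hand-picked sample counts, and the cost measure (number of integrand evaluations). The estimator uses the telescoping sum E[Q_L] = E[Q_0] + sum over l = 1..L of E[Q_l - Q_{l-1}], so that most samples are drawn on the cheap coarse level.

```python
import math
import random


def model(omega, level):
    """Hypothetical level-`level` model: midpoint-rule approximation of the
    integral of exp(omega * t) over t in [0, 1] using 2**(level + 1) cells."""
    n = 2 ** (level + 1)
    h = 1.0 / n
    return h * sum(math.exp(omega * (i + 0.5) * h) for i in range(n))


def mlmc_estimate(samples_per_level, rng):
    """MLMC estimate of E[Q_L] via the telescoping sum over levels."""
    total = 0.0
    for level, n_samples in enumerate(samples_per_level):
        acc = 0.0
        for _ in range(n_samples):
            omega = rng.random()  # uncertain parameter, assumed U(0, 1)
            fine = model(omega, level)
            # The same omega is used on both levels so that the
            # correction fine - coarse has a small variance.
            coarse = model(omega, level - 1) if level > 0 else 0.0
            acc += fine - coarse
        total += acc / n_samples
    return total


def mc_estimate(n_samples, level, rng):
    """Standard MC estimate using only the finest-level model."""
    return sum(model(rng.random(), level) for _ in range(n_samples)) / n_samples


def mlmc_cost(samples_per_level):
    """Integrand evaluations used by MLMC (fine plus coarse model per sample)."""
    return sum(
        n * (2 ** (l + 1) + (2 ** l if l > 0 else 0))
        for l, n in enumerate(samples_per_level)
    )


def mc_cost(n_samples, level):
    """Integrand evaluations used by standard MC on a single level."""
    return n_samples * 2 ** (level + 1)


# Hand-picked sample counts (an assumption): many coarse, few fine samples.
SAMPLES_PER_LEVEL = [4000, 400, 100, 25]
```

With these assumed sample counts, `mlmc_cost(SAMPLES_PER_LEVEL)` is 12,200 integrand evaluations, against 64,000 for `mc_cost(4000, 3)`, while both estimators target the same finest-level expectation. In practice the per-level sample counts are usually chosen from estimated variances and costs of each level rather than fixed by hand.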
