Multimodel Bayesian analysis of data-worth applied to unsaturated fractured tuffs

Managing water resource and environmental systems effectively requires suitable data. The worth of collecting such data depends on their potential benefit and cost, including the expected cost (risk) of failing to make an appropriate decision. Evaluating this risk calls for a probabilistic approach to data-worth assessment. Recently we [39] developed a multimodel approach to optimum value-of-information or data-worth analysis based on model averaging within a maximum likelihood Bayesian framework. Adopting a two-dimensional synthetic example, we implemented our approach using Monte Carlo (MC) simulations with and without lead-order approximations, finding that the approximate approach was almost as accurate but computationally more efficient. Here we apply our methodology to pneumatic permeability data from vertical and inclined boreholes drilled into unsaturated fractured tuff near Superior, Arizona. To improve computational efficiency, we introduce three new approximations that require less computational effort and compare their results with those obtained by the original Monte Carlo method. The first approximation disregards uncertainty in model parameter estimates, the second disregards uncertainty in estimates of potential new data, and the third disregards both. We find that only the first approximation yields reliable quantitative assessments of the reductions in predictive uncertainty brought about by the collection of new data. We conclude that, whereas parameter uncertainty may sometimes be disregarded for purposes of analyzing data worth, the same does not generally apply to uncertainty in estimates of potential new data.
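As a rough illustration of the model-averaging idea the abstract relies on, the following sketch computes a model-averaged predictive mean and variance using the standard law of total variance. It is not the authors' implementation: the function name `bma_moments`, the two-model setup, and all numbers are hypothetical, and the per-model moments and posterior model probabilities are assumed to be given rather than estimated by maximum likelihood.

```python
from math import fsum


def bma_moments(means, variances, probs):
    """Model-averaged moments of a prediction (illustrative only).

    means[k], variances[k]: predictive mean and variance under model k.
    probs[k]: posterior probability of model k; the values must sum to 1.
    Returns (mean, within-model variance, between-model variance, total variance).
    """
    if not (len(means) == len(variances) == len(probs)):
        raise ValueError("inputs must have equal length")
    if abs(fsum(probs) - 1.0) > 1e-9:
        raise ValueError("model probabilities must sum to 1")
    # Averaged prediction: probability-weighted mean over models.
    mean = fsum(p * m for p, m in zip(probs, means))
    # Within-model term: probability-weighted average of per-model variances.
    within = fsum(p * v for p, v in zip(probs, variances))
    # Between-model term: spread of the model means around the averaged mean.
    between = fsum(p * (m - mean) ** 2 for p, m in zip(probs, means))
    return mean, within, between, within + between


# Hypothetical numbers: two equally probable models, before and after
# new data that (by assumption) halve each model's predictive variance.
before = bma_moments([1.0, 3.0], [0.5, 0.5], [0.5, 0.5])
after = bma_moments([1.0, 3.0], [0.25, 0.25], [0.5, 0.5])
variance_reduction = before[3] - after[3]
```

In this toy setting the reduction in total variance plays the role of the "reduction in predictive uncertainty" mentioned above. In an actual data-worth analysis the new data are unknown in advance, so the post-data variance would itself have to be averaged over uncertain potential data, which is the uncertainty the abstract says cannot generally be ignored.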

[1]  D. Madigan,et al.  Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke , 1997 .

[2]  A. Raftery Bayesian Model Selection in Social Research , 1995 .

[3]  Mario Chica-Olmo,et al.  Using Semivariogram Parameter Uncertainty in Hydrogeological Applications , 2009, Ground water.

[4]  Yoram Rubin,et al.  The concept of comparative information yield curves and its application to risk‐based site characterization , 2009 .

[5]  S. P. Neuman,et al.  Maximum likelihood Bayesian averaging of uncertain model predictions , 2003 .

[6]  Lars Rosén,et al.  An Outline of a Guidance Framework for Assessing Hydrogeological Risks at Early Stages , 1997 .

[7]  Keith Beven,et al.  The future of distributed models: model calibration and uncertainty prediction. , 1992 .

[8]  H. Akaike A new look at the statistical model identification , 1974 .

[9]  M. Sohn,et al.  Reducing uncertainty in site characterization using Bayes Monte Carlo methods. , 2000 .

[10]  Velimir V. Vesselinov,et al.  Maximum likelihood Bayesian averaging of airflow models in unsaturated fractured tuff using Occam and variance windows , 2010 .

[11]  Yoram Rubin,et al.  A risk‐driven approach for subsurface site characterization , 2008 .

[12]  Ming Ye,et al.  Comment on “Inverse groundwater modeling for hydraulic conductivity estimation using Bayesian model averaging and variance window” by Frank T.‐C. Tsai and Xiaobao Li , 2010 .

[13]  Ming Ye,et al.  Maximum likelihood Bayesian averaging of spatial variability models in unsaturated fractured tuff , 2003 .

[14]  Alan J. Rabideau,et al.  Decision Analysis for Pump‐and‐Treat Design , 2000 .

[15]  S. P. Neuman,et al.  On model selection criteria in multimodel analysis , 2007 .

[16]  S. P. Neuman,et al.  Estimation of Aquifer Parameters Under Transient and Steady State Conditions: 1. Maximum Likelihood Method Incorporating Prior Information , 1986 .

[17]  Adrian E. Raftery,et al.  Bayesian model averaging: a tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors) , 1999 .

[18]  S. Gorelick,et al.  When enough is enough: The worth of monitoring data in aquifer remediation design , 1994 .

[19]  S. P. Neuman,et al.  Bayesian analysis of data-worth considering model and parameter uncertainties , 2012 .

[20]  Fumie Yokota,et al.  Value of Information Literature Analysis: A Review of Applications in Health Risk Management , 2004, Medical decision making : an international journal of the Society for Medical Decision Making.

[21]  Karim C. Abbaspour,et al.  A Bayesian approach for incorporating uncertainty and data worth in environmental projects , 1996 .

[22]  C. Tiedeman,et al.  Methods for using groundwater model predictions to guide hydrogeologic data collection, with application to the Death Valley regional groundwater flow system , 2003 .

[23]  Joel Massmann,et al.  Hydrogeological Decision Analysis: 4. The Concept of Data Worth and Its Use in the Development of Site Investigation Strategies , 1992 .

[24]  S. Sain,et al.  Combining climate model output via model correlations , 2010 .

[25]  Okke Batelaan,et al.  On the value of conditioning data to reduce conceptual model uncertainty in groundwater modeling , 2010 .

[26]  Y. Rubin,et al.  A Bayesian approach for inverse modeling, data assimilation, and conditional simulation of spatial random fields , 2010 .

[27]  Adrian E. Raftery,et al.  Accounting for Model Uncertainty in Survival Analysis Improves Predictive Performance , 1995 .

[28]  K. Beven,et al.  Bayesian Estimation of Uncertainty in Runoff Prediction and the Value of Data: An Application of the GLUE Approach , 1996 .

[29]  D. M. Ely,et al.  A method for evaluating the importance of system state observations to model predictions, with application to the Death Valley regional groundwater flow system , 2004 .

[30]  Clifford M. Hurvich,et al.  Regression and time series model selection in small samples , 1989 .

[31]  M. Willmann,et al.  Impact of log-transmissivity variogram structure on groundwater flow and transport predictions , 2009 .

[32]  Tom Fearn,et al.  What exactly is fitness for purpose in analytical measurement , 1996 .

[33]  Steven M. Gorelick,et al.  Framework to evaluate the worth of hydraulic conductivity data for optimal groundwater resources management in ecologically sensitive areas , 2005 .

[34]  Vittorio Di Federico,et al.  Scaling of random fields by means of truncated power variograms and associated spectra , 1997 .

[35]  Anders L. Madsen,et al.  Value of Information Analysis , 2013 .

[36]  Joel Massmann,et al.  Hydrogeological Decision Analysis: 1. A Framework , 1990 .

[37]  Brian J. Wagner,et al.  Evaluating data worth for ground-water management under uncertainty , 1999 .

[38]  Laura Toran,et al.  RISK-COST DECISION FRAMEWORK FOR AQUIFER REMEDIATION DESIGN , 1996 .

[39]  Peter J. Diggle,et al.  Bayesian Geostatistical Design , 2006 .

[40]  S. P. Neuman,et al.  Estimation of spatial covariance structures by adjoint state maximum likelihood cross validation: 1. Theory , 1989 .

[41]  Mohammadali Tarrahi,et al.  Assessing the performance of the ensemble Kalman filter for subsurface flow data integration under variogram uncertainty , 2011 .

[42]  Clifford I. Voss,et al.  Multiobjective sampling design for parameter estimation and model discrimination in groundwater solute transport , 1989 .

[43]  Lucien Duckstein,et al.  Bayesian decision theory applied to design in hydrology , 1972 .

[44]  S. P. Neuman,et al.  Summary of air permeability data from single-hole injection tests in unsaturated fractured tuffs at the Apache Leap Research Site: Results of steady-state test interpretation , 1996 .

[45]  Ming Ye,et al.  Dependence of Bayesian Model Selection Criteria and Fisher Information Matrix on Sample Size , 2011 .

[46]  Keith Beven,et al.  Equifinality, data assimilation, and uncertainty estimation in mechanistic modelling of complex environmental systems using the GLUE methodology , 2001 .

[47]  George E. P. Box,et al.  Sampling and Bayes' inference in scientific modelling and robustness , 1980 .

[48]  David Draper,et al.  Assessment and Propagation of Model Uncertainty , 2011 .

[49]  R. Taplin Robust Likelihood Calculation for Time Series , 1993 .

[50]  Rangasami L. Kashyap,et al.  Optimal Choice of AR and MA Parts in Autoregressive Moving Average Models , 1982, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  R. M. Lark,et al.  Optimized Sample Schemes for Geostatistical Surveys , 2007 .

[52]  P. Kitanidis,et al.  Maximum likelihood parameter estimation of hydrologic spatial processes by the Gauss-Newton method , 1985 .

[53]  Y. Rubin,et al.  Bayesian geostatistical design: Task‐driven optimal site investigation when the geostatistical model is uncertain , 2010 .

[54]  S. Kesler,et al.  Pre-posterior analysis as a tool for data evaluation: Application to aquifer contamination , 1988 .

[55]  S. P. Neuman,et al.  Sensitivity analysis and assessment of prior model probabilities in MLBMA with application to unsaturated fractured tuff , 2005 .

[56]  R. Freeze,et al.  The worth of data in predicting aquitard continuity in hydrogeological design , 1993 .

[57]  Frank T.-C. Tsai,et al.  Inverse groundwater modeling for hydraulic conductivity estimation using Bayesian model averaging and variance window , 2008 .

[58]  Christian D Langevin,et al.  Quantifying Data Worth Toward Reducing Predictive Uncertainty , 2010, Ground water.

[59]  Pär-Erik Back,et al.  A model for estimating the value of sampling programs and the optimal number of samples for contaminated soil , 2007 .

[60]  Werner G. Müller,et al.  Collecting Spatial Data: Optimum Design of Experiments for Random Fields , 1998 .

[61]  D. Madigan,et al.  Correction to: “Bayesian model averaging: a tutorial” [Statist. Sci. 14 (1999), no. 4, 382–417; MR 2001a:62033] , 2000 .

[62]  Lars Rosén,et al.  Calculating the optimal number of contaminant samples by means of data worth analysis , 2006 .

[63]  Clayton V. Deutsch,et al.  GSLIB: Geostatistical Software Library and User's Guide , 1993 .

[64]  Charles A. Ingene,et al.  Specification Searches: Ad Hoc Inference with Nonexperimental Data , 1980 .

[65]  G. McNulty,et al.  Value of information analysis - Nevada test site , 1997 .