Geostatistical significance of differences for spatial subsurface phenomenon

ABSTRACT Optimal subsurface development decisions depend on uncertainty models that integrate all available information, data, and interpretations. Failure to account for the spatial context of subsurface data may lead to naive and overconfident estimates of uncertainty. Such overconfidence degrades subsurface development decision-making and, ultimately, the economics of subsurface projects. An essential component is determining the significance of observed local differences in spatial datasets. New and practical methods are required to assess the significance of differences in the local expectation of spatial measures. To determine whether a difference is geostatistically significant, we introduce significance measures that account for the spatial context. We demonstrate geostatistical significance in three distinct spatial applications. The demonstrations use realistic synthetic data for a spatial feature, porosity, covering both stationary cases (i.e., the statistics of interest are invariant under translation) and cases with geological trends. All examples use random function fields constrained by the histogram, two-point semivariogram, local conditioning data, and expert-mapped secondary information. The proposed workflow performs a spatial bootstrap to empirically assess the null distribution and the uncertainty in the difference in means, given the spatial context, for the locally unconditional and conditional cases. The resulting empirical uncertainty distributions are used to calculate p-values for a spatial version of null hypothesis significance testing. However, decisions based solely on p-values suffer from misinterpretation and poor reproducibility. To address this problem, we propose an alternative workflow that computes the Bayes factor bound and estimation graphics to analyze the results comprehensively and provide evidence to support decision-making.
The suggested workflow for geostatistical significance also supports other subsurface applications, such as trend modeling, stress testing of stationarity decisions, and checking model accuracy against input spatial statistics. Although presented for subsurface porosity modeling, the proposed workflow is compatible with a wide range of spatial and subsurface settings, including soil erosion monitoring, groundwater contamination mitigation, sustainable forestry, mining grade control, and well pad design. Furthermore, the proposed workflow offers an improved assessment of uncertainty and robust standards for concluding whether an observed difference in spatial sample data is meaningful.
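
The locally unconditional case of the workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an isotropic exponential covariance model with a known sill and range, simulates Gaussian values at the sample locations by Cholesky factorization, and uses the commonly stated form of the Bayes factor bound, 1/(-e p ln p) for p < 1/e. All function and parameter names are hypothetical.

```python
import numpy as np


def exp_covariance(coords, sill, corr_range):
    """Isotropic exponential covariance (assumed model, practical-range form)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-3.0 * d / corr_range)


def spatial_bootstrap_diff(coords_a, coords_b, sill, corr_range,
                           n_real=2000, seed=0):
    """Null distribution of mean(A) - mean(B) under one stationary model.

    Each realization is an unconditional Gaussian simulation at the sample
    locations, so the spatial correlation between samples is preserved.
    """
    rng = np.random.default_rng(seed)
    coords = np.vstack([coords_a, coords_b])
    cov = exp_covariance(coords, sill, corr_range)
    # Small diagonal jitter keeps the factorization numerically stable.
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(coords)))
    z = chol @ rng.standard_normal((len(coords), n_real))
    n_a = len(coords_a)
    return z[:n_a].mean(axis=0) - z[n_a:].mean(axis=0)


def two_sided_p_value(null_diffs, observed_diff):
    """Empirical two-sided p-value with an add-one correction (never zero)."""
    exceed = np.sum(np.abs(null_diffs) >= abs(observed_diff))
    return (exceed + 1) / (len(null_diffs) + 1)


def bayes_factor_bound(p):
    """Upper bound on the Bayes factor against the null: 1 / (-e p ln p)."""
    if p >= 1.0 / np.e:
        return 1.0
    return 1.0 / (-np.e * p * np.log(p))


# Hypothetical usage: two tight sample clusters far apart from each other.
rng = np.random.default_rng(1)
coords_a = rng.uniform(0.0, 10.0, size=(20, 2))
coords_b = rng.uniform(0.0, 10.0, size=(20, 2)) + np.array([1000.0, 0.0])
null_diffs = spatial_bootstrap_diff(coords_a, coords_b, sill=1.0, corr_range=50.0)
p = two_sided_p_value(null_diffs, observed_diff=1.5)
bfb = bayes_factor_bound(p)
```

Because samples within each cluster are strongly correlated, the simulated null distribution of the difference in means is wider than it would be under an independence assumption, so the same observed difference yields a larger p-value. The conditional case would additionally require conditioning each realization on local data, which this sketch omits.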
