Using measurements close to a detection limit in a geostatistical case study to predict selenium concentration in topsoil

Data on environmental variables are subject to measurement error (ME), and this ME should be considered in any statistical analysis. Environmental datasets commonly consist of positive random variables that have skewed distributions. Measurements are then usually reported with a theoretical detection limit (DL); measurements less than this DL are deemed not to be statistically different from zero, and these data are then treated by setting them to an arbitrary value of half of the DL. The skew of the data is dealt with by taking logarithms, and the geostatistical analysis is performed on the transformed variable. The DL approach, however, is somewhat ad hoc, and in this paper we investigate an alternative approach to incorporating such measurements in a geostatistical analysis, namely Bayesian hierarchical modelling. This approach incorporates 'soft' data (i.e., imprecise information), and we use soft data to represent the information that each measurement provides. With this approach we can combine a lognormal model for the spatial variability with a Gaussian model for the measurement error. We apply the methods to a dataset on the selenium (Se) concentration in the topsoil throughout the East Anglia region of the UK. We compare the maps of predictions produced by the two approaches, and compare the methods on their ability to predict the Se concentration and the associated uncertainty. We also consider how the geostatistical predictions might be used to aid the effective management of Se-deficient soils, and compare the methods on the costs that might be incurred from the selected management strategies. We found that the Bayesian approach based on soft data resulted in smoother maps, reduced the errors of the predictions, and provided a better representation of the associated uncertainty. The cost resulting from Se-deficient soils was generally lower when we used the soft data approach, and we conclude that this provides a more effective and interpretable model for the data in this case study, and possibly for other environmental datasets with measurements close to a DL. (C) 2009 Elsevier B.V. All rights reserved.
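To make the contrast between the two treatments concrete, the following is a minimal, non-spatial sketch in Python. It is not the hierarchical model used in the paper: it considers a single location, assumes the lognormal parameters and the ME standard deviation are known, and evaluates the posterior by simple grid integration. The function names `half_dl_log` and `soft_posterior_mean`, and all numerical values in the usage note, are hypothetical and chosen only for illustration.

```python
import math


def half_dl_log(measurements, dl):
    """Conventional treatment: set values below the DL to DL/2, then take logs."""
    return [math.log(z if z >= dl else dl / 2.0) for z in measurements]


def soft_posterior_mean(z, sigma_me, mu_log, sd_log, n_grid=4000):
    """Posterior mean of the true concentration y > 0 given one measurement z.

    Assumed model (illustrative only, single location, known parameters):
      prior:       log(y) ~ Normal(mu_log, sd_log^2)   (lognormal variability)
      likelihood:  z | y  ~ Normal(y, sigma_me^2)      (Gaussian measurement error)
    The posterior is integrated numerically on a midpoint grid over (0, upper).
    """
    upper = math.exp(mu_log + 6.0 * sd_log)  # covers almost all prior mass
    step = upper / n_grid
    num = 0.0
    den = 0.0
    for i in range(n_grid):
        y = (i + 0.5) * step
        # Unnormalised log density of the lognormal prior at y.
        log_prior = -math.log(y) - 0.5 * ((math.log(y) - mu_log) / sd_log) ** 2
        # Unnormalised log density of the Gaussian ME likelihood of z given y.
        log_lik = -0.5 * ((z - y) / sigma_me) ** 2
        w = math.exp(log_prior + log_lik)
        num += y * w
        den += w
    return num / den
```

For example, with a hypothetical DL of 0.1, `half_dl_log([0.02, 0.08], 0.1)` maps both measurements to the same value, log(0.05), whereas `soft_posterior_mean(0.02, 0.1, math.log(0.3), 0.5)` and `soft_posterior_mean(0.08, 0.1, math.log(0.3), 0.5)` give different positive estimates. Under these assumptions the measurement below the DL still carries information, and even a negative reading yields a positive estimate of the concentration.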