Towards a soil information system for uncertain soil data

Abstract: Understanding the limitations of soil data is essential both for managing environmental systems effectively and for encouraging the responsible use of soil data. Explicit assessment of the uncertainties associated with soil data, and their storage in a soil database, are therefore important. In practice, users will not want to separate ‘uncertain data’ from ‘certain data’ and will therefore require a single database that meets all the requirements of a conventional database and can also handle uncertain data specifically. This chapter presents a framework that facilitates the storage of information about soil-data quality, including the uncertainties associated with soil data, in a conventional database design. It comprises a methodology for classifying data according to their attribute scale, which influences the structure of an uncertainty model, and their space–time variability, which determines the need for autocorrelation functions in describing uncertainty. In terms of the former, the key distinctions are among real numbers on a continuous domain, real or integer numbers on a discrete domain, categorical data and narrative data. In terms of the latter, the key distinctions are among data that are constant in space and time (e.g. universal constants); data that vary in time but not in space; data that vary in space but not in time; and data that vary in both space and time. Thus, we distinguish 13 ‘data types’ to which individual datasets may be assigned. This simplifies the process of assessing uncertainties about soil data because characteristic uncertainty models can be defined for each data type. In general terms, an uncertain soil variable is completely specified by its probability distribution function (pdf). However, the complexity of a pdf varies among the 13 ‘data types’ identified. For example, the (cumulative) pdf of an uncertain numerical constant is simply a nondecreasing function on the real line. The database stores this function or some parameters of it, such as the mean and variance. Other data types are associated with more complex pdfs. For example, an uncertain categorical soil map requires the probability of each soil type occurring at any location to be defined (local uncertainty), as well as the spatial dependencies between these probabilities at multiple locations (spatial uncertainty). In practice, the complexity of the joint pdf will make it difficult or impossible to identify in full. Assumptions (such as a stationarity assumption) are therefore required to reduce the number of model parameters.
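
The classification described in the abstract can be illustrated with a minimal Python sketch. It is not the chapter's implementation. The abstract does not say how four attribute scales and four variability classes yield 13 data types; the sketch assumes that narrative data are treated as a single type that is not subdivided by space–time variability (3 × 4 + 1 = 13). All names (`AttributeScale`, `Variability`, `enumerate_data_types`, `needs_autocorrelation`, `UncertainConstant`) are hypothetical, and the normal distribution used for the uncertain constant is likewise an assumption chosen only because it is fully described by a mean and a variance.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import NormalDist
from typing import List, Optional, Tuple


class AttributeScale(Enum):
    CONTINUOUS_NUMERICAL = "real numbers on a continuous domain"
    DISCRETE_NUMERICAL = "real or integer numbers on a discrete domain"
    CATEGORICAL = "categorical data"
    NARRATIVE = "narrative data"


class Variability(Enum):
    CONSTANT = "constant in space and time"
    TIME_ONLY = "varies in time, not in space"
    SPACE_ONLY = "varies in space, not in time"
    SPACE_AND_TIME = "varies in space and time"


DataType = Tuple[AttributeScale, Optional[Variability]]


def enumerate_data_types() -> List[DataType]:
    """List the data types under the assumption stated in the lead-in."""
    types: List[DataType] = []
    for scale in AttributeScale:
        if scale is AttributeScale.NARRATIVE:
            # Assumption: narrative data form one type, not split by variability.
            types.append((scale, None))
        else:
            types.extend((scale, v) for v in Variability)
    return types


def needs_autocorrelation(variability: Optional[Variability]) -> bool:
    """True when the data vary in space and/or time (assumed rule)."""
    return variability not in (None, Variability.CONSTANT)


@dataclass(frozen=True)
class UncertainConstant:
    """Hypothetical record for an uncertain numerical constant.

    Stores only the mean and variance; the normal shape is an assumption
    of this sketch, not something the abstract prescribes.
    """
    name: str
    mean: float
    variance: float

    def cdf(self, x: float) -> float:
        # Cumulative distribution: a nondecreasing function on the real line.
        return NormalDist(self.mean, self.variance ** 0.5).cdf(x)


if __name__ == "__main__":
    print(len(enumerate_data_types()))
    constant = UncertainConstant("hypothetical_constant", mean=6.5, variance=0.04)
    print(constant.cdf(6.5))
```

Keeping the two classification axes as separate enumerations mirrors the abstract's point that attribute scale shapes the uncertainty model while space–time variability decides whether autocorrelation must be described. A categorical soil map would need a richer record than `UncertainConstant`, with per-location class probabilities plus a model of their spatial dependence.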
