Methods to Quantify Error Propagation and Prediction Uncertainty for USGS Raster Processing

Executive Summary

Errors associated with geospatial data can propagate through natural-science (biologic, geographic, geologic, geospatial, and hydrologic) models that use raster processing, resulting in significant and spatially variable prediction uncertainty. This inherent prediction uncertainty affects how model results are interpreted by scientists, environmental regulators, resource managers, elected officials, and the general public. USGS scientists frequently use raster processing of geospatial data to create independent variables for empirical models and boundary conditions for mechanistic models, to extrapolate beyond observed data (dependent variables) and make predictions for unobserved cases or spatial extents, and to make relatively simple, everyday calculations, such as defining depth to water from land-surface and water-table raster data sets. Yet the propagation of input errors from geospatial data and the resulting prediction uncertainty of raster-based models are rarely quantified.

To maintain scientific leadership and provide the best available science, the USGS must address the following priority research questions: What role does the propagation of error from geospatial data during raster processing play in the prediction uncertainty of USGS models and calculations? How can this prediction uncertainty be quantified in these USGS models? Can prediction uncertainty be minimized in future iterations of these USGS models?

The following proposal addresses these priority research questions. It is designed to develop and implement a stochastic method that identifies the propagation of input errors from geospatial data during raster processing and quantifies the associated prediction uncertainty in USGS models. A novel ArcGIS tool will be developed that uses Latin Hypercube Sampling (a stratified stochastic approach similar to Monte Carlo analysis) to quantify error propagation and prediction uncertainty of geospatial models.
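The idea behind Latin Hypercube Sampling can be illustrated with the depth-to-water calculation mentioned above. The following is a minimal sketch in plain Python, not the proposed ArcGIS tool. It assumes that the errors in each input raster are normally distributed with zero mean and are independent from cell to cell, which is a simplification. The function names (`lhs_normal`, `depth_to_water_uncertainty`), the 2 x 2 rasters, and the error standard deviations are all hypothetical values chosen for illustration.

```python
import random
from statistics import NormalDist, mean, stdev


def lhs_normal(n, sigma, rng):
    """Return n Latin Hypercube draws from a zero-mean normal error model."""
    dist = NormalDist(0.0, sigma)
    draws = []
    for i in range(n):
        # One uniform draw inside the i-th of n equal-probability strata.
        p = (i + rng.random()) / n
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep inv_cdf away from 0 and 1
        draws.append(dist.inv_cdf(p))
    rng.shuffle(draws)  # random pairing with the other input's strata
    return draws


def depth_to_water_uncertainty(land, water, sd_land, sd_water, n=200, seed=1):
    """Per-cell mean and standard deviation of (land - water) over n realizations."""
    rng = random.Random(seed)
    mean_grid, sd_grid = [], []
    for land_row, water_row in zip(land, water):
        mean_row, sd_row = [], []
        for z_land, z_water in zip(land_row, water_row):
            # Assumed error model: independent errors in each input, each cell.
            e_land = lhs_normal(n, sd_land, rng)
            e_water = lhs_normal(n, sd_water, rng)
            depths = [(z_land + a) - (z_water + b)
                      for a, b in zip(e_land, e_water)]
            mean_row.append(mean(depths))
            sd_row.append(stdev(depths))
        mean_grid.append(mean_row)
        sd_grid.append(sd_row)
    return mean_grid, sd_grid


# Hypothetical 2 x 2 rasters (elevations in meters) and error standard deviations.
land_surface = [[1000.0, 1002.0], [1005.0, 1010.0]]
water_table = [[970.0, 968.0], [980.0, 975.0]]
mean_depth, sd_depth = depth_to_water_uncertainty(
    land_surface, water_table, sd_land=2.0, sd_water=5.0
)
```

Here `sd_depth` plays the role of a prediction-uncertainty raster. Because this sketch uses a linear calculation with the same error standard deviations in every cell, the resulting uncertainty is nearly uniform. Spatially variable uncertainty would arise once the input errors differ by location or the model is nonlinear.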
To demonstrate the approach, its utility, and its high likelihood of success, the proposed method has been applied to a groundwater-quality model of the High Plains aquifer. This application shows that the spatially variable prediction uncertainty of geospatial models can be quantified and that errors can be evaluated to reduce this uncertainty in future iterations of the model. The demonstration example uses a common USGS hydrologic model, but the method and tool developed under this proposal would have cross-disciplinary applications for any biologic, … To maintain scientific leadership and provide the best available science to the public and cooperators, USGS scientists must present our data to the best of our ability, and this includes estimates of, and information on, the uncertainty of geospatial data and associated raster-based predictive models. A major …
