Positional Accuracy of Spatial Data: Non‐Normal Distributions and a Critique of the National Standard for Spatial Data Accuracy

Spatial data quality is a paramount concern in all GIS applications. Existing spatial data accuracy standards, including the National Standard for Spatial Data Accuracy (NSSDA) used in the United States, commonly assume that the positional error of spatial data is normally distributed. This research characterizes the distribution of the positional error in four types of spatial data: GPS locations, street geocoding, TIGER roads, and LIDAR elevation data. The positional error in GPS locations can be approximated with a Rayleigh distribution, and the positional error in street geocoding and TIGER roads can be approximated with a log-normal distribution. The positional error in LIDAR elevation data can be approximated with a normal distribution fitted to the original vertical error values after removal of a small number of outliers. For all four data types, however, these solutions are only approximations, and each dataset showed some evidence of non-stationary behavior resulting in a lack of normality. Monte Carlo simulation of the robustness of accuracy statistics revealed that the conventional 100% Root Mean Square Error (RMSE) statistic is not reliable for non-normal distributions. Some degree of data trimming is recommended through the use of 90% and 95% RMSE statistics. Percentiles, however, are not very robust as single positional accuracy statistics. The non-normal distribution of positional errors in spatial data has implications for spatial data accuracy standards and error propagation modeling. Specific recommendations are formulated for revisions of the NSSDA.
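The contrast between the conventional 100% RMSE and a trimmed RMSE can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not spell out: a "95% RMSE" is taken here to mean the RMSE computed after discarding the largest 5% of absolute errors, and the log-normal parameters, sample sizes, and the helper names `trimmed_rmse` and `spread_across_samples` are hypothetical choices made only for illustration, not the study's actual procedure.

```python
import math
import random
import statistics


def rmse(errors):
    """Root mean square error of a sequence of error values."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def trimmed_rmse(errors, keep=0.95):
    """RMSE over the smallest `keep` fraction of absolute errors.

    Assumption: this is one plausible reading of a "95% RMSE";
    the largest (1 - keep) share of absolute errors is discarded.
    """
    if not 0.0 < keep <= 1.0:
        raise ValueError("keep must be in (0, 1]")
    ordered = sorted(abs(e) for e in errors)
    n_keep = max(1, round(keep * len(ordered)))
    return rmse(ordered[:n_keep])


def spread_across_samples(stat, n_samples=200, sample_size=100, seed=0):
    """Coefficient of variation of `stat` over repeated simulated samples.

    Errors are drawn from a log-normal distribution with illustrative
    parameters (mu=0, sigma=1); a smaller value suggests a more stable
    statistic under this particular simulation.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        sample = [rng.lognormvariate(0.0, 1.0) for _ in range(sample_size)]
        values.append(stat(sample))
    return statistics.stdev(values) / statistics.mean(values)


if __name__ == "__main__":
    print("100% RMSE spread:", spread_across_samples(rmse))
    print(" 95% RMSE spread:", spread_across_samples(trimmed_rmse))
    print(" 90% RMSE spread:",
          spread_across_samples(lambda s: trimmed_rmse(s, keep=0.90)))
```

Comparing the printed spreads gives a rough sense of how sensitive each statistic is to the heavy tail of a skewed error distribution. The result depends on the simulated distribution and sample size chosen here, so it should not be read as reproducing the study's findings.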
