Estimating Spatial Intensity and Variation in Risk from Locations Subject to Geocoding Errors

The accurate assignment of geocodes to the residences of subjects in a study population is an important component of the data acquisition and assimilation stage of a spatial epidemiological investigation. Obtaining accurate point-level geocodes, however, is not a simple matter. Recent investigations have demonstrated that when residential address geocoding is performed by the most common method, street-segment matching to a georeferenced road file followed by interpolation, positional errors of hundreds of meters are commonplace, especially in rural locations. Ignoring these errors in a statistical analysis may lead to biased estimators, a reduction in power, and incorrect conclusions. This article takes existing likelihood-based procedures for estimating the intensity and relative risk of Poisson spatial point processes, which were developed for locations ascertained without error, and modifies them so that valid inferences can be made from locations observed with error. Simulations demonstrate that the modified methods outperform methods that ignore positional errors.
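As a rough illustration of why ignoring positional error biases an intensity estimate, the following toy sketch may help. It is not the article's procedure. It assumes a one-dimensional Poisson process whose intensity is a scaled Gaussian curve, and it assumes independent Gaussian positional errors with a known standard deviation; both assumptions, and all function and parameter names, are hypothetical choices made for the example. Under these assumptions the observed points again form a Poisson process whose intensity is the true intensity convolved with the error density, so the likelihood can be written for the observed locations and maximized in closed form.

```python
import numpy as np


def fit_gaussian_intensity(observed, error_sd=0.0):
    """Maximum-likelihood fit of a Poisson intensity n * Normal(mu, spread^2).

    The observed points are assumed to be true points displaced by
    independent Normal(0, error_sd^2) errors (a toy assumption), so the
    observed intensity is n * Normal(mu, spread^2 + error_sd^2).
    Setting error_sd=0 gives the naive fit that ignores positional error.
    """
    observed = np.asarray(observed, dtype=float)
    n_hat = observed.size                 # MLE of the expected number of points
    mu_hat = observed.mean()              # MLE of the centre of the intensity
    total_var = observed.var()            # MLE of spread^2 + error_sd^2
    # Remove the known error variance; clip at zero to respect spread >= 0.
    spread_hat = np.sqrt(max(total_var - error_sd ** 2, 0.0))
    return n_hat, mu_hat, spread_hat


def simulate_observed(rng, expected_n=2000, mu=0.0, spread=0.3, error_sd=0.4):
    """Draw true locations from the Poisson process, then add positional error."""
    n = rng.poisson(expected_n)
    true_x = rng.normal(mu, spread, size=n)
    return true_x + rng.normal(0.0, error_sd, size=n)


rng = np.random.default_rng(0)
obs = simulate_observed(rng)

naive = fit_gaussian_intensity(obs, error_sd=0.0)
corrected = fit_gaussian_intensity(obs, error_sd=0.4)

print("naive spread estimate:    ", round(naive[2], 3))
print("corrected spread estimate:", round(corrected[2], 3))
```

In this toy setting the naive fit recovers the spread of the observed pattern, which is inflated by the error variance, while the error-aware fit targets the spread of the true locations. Real geocoding errors need not be Gaussian or have known variance, and bounded study regions introduce edge effects that this sketch ignores.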
