Positional Accuracy of Geocoded Addresses in Epidemiologic Research

Background. Geographic information systems (GIS) offer powerful techniques for epidemiologists. Geocoding is an important step in the use of GIS in epidemiologic research, and the validity of epidemiologic studies using this methodology depends, in part, on the positional accuracy of the geocoding process.

Methods. We conducted a study comparing the validity of positions geocoded with a commercially available program to positions determined by Global Positioning System (GPS) satellite receivers. Addresses (N = 200) were randomly selected from a recently completed case–control study in Western New York State. We geocoded addresses using ArcView 3.2 on the GDT Dynamap/2000 U.S. Street database. In addition, we measured the longitude and latitude of these addresses with a GPS receiver. The distance between the locations obtained by these two methods was calculated for all addresses.

Results. The distance between the geocoded point and the GPS point was within 100 m for the majority of subject addresses (79%), with only a small proportion (3%) having a distance greater than 800 m. The overall median distance between GPS points and geocoded points was 38 m (90% confidence interval [CI] = 34–46). Distances were not different for cases and controls. Urban addresses (median = 32 m; CI = 28–37) were slightly more accurate than nonurban addresses (median = 52 m; CI = 44–61).

Conclusions. This study indicates that the suitability of geocoding for epidemiologic research depends on the level of spatial resolution required to assess exposure. Although sources of error in positional accuracy for geocoded addresses exist, geocoding of addresses is, for the most part, very accurate.
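
The positional-error comparison described in the Methods (distance between each geocoded point and its GPS point, then a median and the share of addresses within 100 m) can be illustrated with a minimal sketch. The abstract does not state how distances were computed, so the haversine great-circle formula on a spherical Earth is an assumption here, and the function names `haversine_m` and `summarize` and the sample values are hypothetical.

```python
import math
from statistics import median

# Mean Earth radius in metres; the spherical model is an assumption of this sketch.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize(distances_m, threshold_m=100.0):
    """Return (median distance, fraction of distances at or below threshold_m)."""
    within = sum(d <= threshold_m for d in distances_m) / len(distances_m)
    return median(distances_m), within


if __name__ == "__main__":
    # Hypothetical (geocoded, GPS) coordinate pairs, not data from the study.
    pairs = [
        ((42.8864, -78.8784), (42.8866, -78.8780)),
        ((42.9000, -78.8500), (42.9010, -78.8500)),
    ]
    distances = [haversine_m(g[0], g[1], p[0], p[1]) for g, p in pairs]
    med, frac = summarize(distances)
    print(f"median = {med:.1f} m, within 100 m = {frac:.0%}")
```

For distances on the order of tens to hundreds of metres, a projected planar coordinate system would give nearly identical results; the spherical formula is used only because it needs nothing beyond longitude and latitude.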
