Geocoding Addresses from a Large Population-based Study: Lessons Learned

Background: Geographic information systems (GIS) and spatial statistics are useful for exploring the relation between geographic location and health. The ultimate usefulness of GIS depends on both the completeness and the accuracy of geocoding (the process of assigning study participants’ residences latitude/longitude coordinates that closely approximate their true locations, also known as address matching). The goal of this project was to develop an iterative geocoding process that would achieve a high match rate in a large population-based health study.

Methods: Data were from a study conducted in Wisconsin using mailing addresses of participants who were interviewed by telephone from 1988 to 1995. We standardized the addresses according to US Postal Service guidelines, used desktop GIS geocoding software and two versions of the Topologically Integrated Geographic Encoding and Referencing street maps, accessed Internet mapping engines for problematic addresses, and recontacted a small number of study participants’ households. We also tabulated the cost, time commitment, and software requirements of each step, with brief notes on each step and its alternatives.

Results: Of the 14,804 participants, 97% were ultimately assigned latitude/longitude coordinates corresponding to their respective residences. The remaining 3% were geocoded to their ZIP code centroid.

Conclusion: The multiple methods described in this work provide practical information for investigators who are considering the use of GIS in their population health research.
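
The iterative process summarized in the Methods can be pictured as a cascade: standardize each address, try each geocoding source in turn, and fall back to the ZIP code centroid only when every source fails. The Python sketch below is illustrative only and is not the authors' software. The function names (`standardize`, `geocode_cascade`), the tiny suffix table, and the stand-in geocoder callables are all assumptions introduced here, and the coordinates in the usage note are made up.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple

Coord = Tuple[float, float]                 # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coord]]  # returns None when no match

# Hypothetical, deliberately tiny subset of street-suffix abbreviations.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD", "DRIVE": "DR"}


def standardize(address: str) -> str:
    """Uppercase, drop punctuation, collapse whitespace, abbreviate suffixes."""
    cleaned = re.sub(r"[^\w\s]", "", address.upper())
    return " ".join(SUFFIXES.get(token, token) for token in cleaned.split())


def geocode_cascade(
    address: str,
    zip_code: str,
    geocoders: List[Tuple[str, Geocoder]],
    zip_centroids: Dict[str, Coord],
) -> Tuple[Optional[Coord], str]:
    """Try each geocoder in order; fall back to the ZIP code centroid.

    Returns the coordinates (or None) and a label naming the step that
    produced them, so the match source can be tabulated afterwards.
    """
    standardized = standardize(address)
    for name, geocode in geocoders:
        coord = geocode(standardized)
        if coord is not None:
            return coord, name
    centroid = zip_centroids.get(zip_code)
    return centroid, ("zip_centroid" if centroid is not None else "unmatched")
```

As a usage illustration, each geocoder can be any callable that maps a standardized address to coordinates or `None`, for example `{"123 MAIN ST": (43.0, -89.0)}.get` as a stand-in for one street-map version. Recording the step label alongside the coordinates makes it straightforward to report how many records were matched at each stage and how many fell through to the centroid.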
