Concordance of Commercial Data Sources for Neighborhood-Effects Studies

Growing evidence supports a relationship between neighborhood-level characteristics and important health outcomes. One source of neighborhood data is commercial business databases, which are integrated with geographic information systems to measure the availability of businesses or destinations that may have favorable or adverse effects on health; however, the quality of these data sources is generally unknown. This study assessed the concordance of two commercial databases in ascertaining the presence, locations, and characteristics of businesses. Businesses in the St. Louis, Missouri area were selected based on their four-digit Standard Industrial Classification (SIC) codes and classified into 14 business categories. Business listings in the two commercial databases were matched by standardized business name within specified distances. Concordance and coverage measures were calculated using capture–recapture methods for all businesses and by business type, with further stratification by census-tract-level population density, percent below poverty, and racial composition. For matched listings, the distance between listings and the agreement in four-digit SIC code, sales volume, and employee size were calculated. Overall, percent agreement between the two databases was 32%. Concordance and coverage estimates were lowest for health-care facilities and leisure/entertainment businesses; highest for popular walking destinations, eating places, and alcohol/tobacco establishments; and varied somewhat by population density. The mean (SD) distance between matched listings was 108.2 (179.0) m, and agreement between matched listings varied by attribute: four-digit SIC code (percent agreement = 84.6%), employee size (weighted kappa = 0.63), and sales volume (weighted kappa = 0.04). Researchers should interpret findings cautiously when using these commercial databases to derive measures of the neighborhood environment.
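The matching and concordance steps described above can be illustrated with a minimal sketch. It is not the study's actual procedure: the name-standardization rule, the greedy one-to-one matching, the 300 m threshold, the definition of percent agreement as matched listings divided by the union of both lists, and the use of the Chapman two-source capture–recapture estimator are all assumptions made for illustration, as are the function names and the toy listings.

```python
import math
import re


def standardize(name):
    """Uppercase, drop punctuation, collapse whitespace (hypothetical rule)."""
    cleaned = re.sub(r"[^A-Z0-9 ]", "", name.upper())
    return " ".join(cleaned.split())


def distance_m(a, b):
    """Planar distance in meters between two (x, y) points in projected coordinates."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def match_listings(db_a, db_b, max_dist_m):
    """Greedy one-to-one match on standardized name within max_dist_m (assumed rule)."""
    matches = []
    used_b = set()
    for i, (name_a, xy_a) in enumerate(db_a):
        for j, (name_b, xy_b) in enumerate(db_b):
            if j in used_b:
                continue
            same_name = standardize(name_a) == standardize(name_b)
            if same_name and distance_m(xy_a, xy_b) <= max_dist_m:
                matches.append((i, j))
                used_b.add(j)
                break
    return matches


def concordance(n_a, n_b, m):
    """Two-source summary measures from list sizes n_a, n_b and m matched listings."""
    union = n_a + n_b - m  # listings found in at least one database
    # Chapman's bias-corrected form of the Lincoln-Petersen estimator (assumed choice).
    n_hat = (n_a + 1) * (n_b + 1) / (m + 1) - 1
    return {
        "percent_agreement": 100.0 * m / union,
        "estimated_total": n_hat,
        "coverage_a": 100.0 * n_a / n_hat,
        "coverage_b": 100.0 * n_b / n_hat,
    }


# Toy listings: (business name, projected (x, y) coordinates in meters).
db_a = [("Joe's Diner", (0, 0)), ("Acme Gym", (1000, 0)), ("Corner Pharmacy", (0, 500))]
db_b = [("JOES DINER", (30, 40)), ("Acme Gym", (5000, 0)), ("City Books", (10, 10))]

matches = match_listings(db_a, db_b, max_dist_m=300)
result = concordance(len(db_a), len(db_b), len(matches))
print(matches, result)
```

In this toy example only the diner matches, since the two "Acme Gym" listings are 4,000 m apart. A two-source estimate of this kind assumes the databases capture businesses independently, which is unlikely to hold exactly for commercial lists, so the coverage figures are best read as rough indicators.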
