A National Prediction Model for PM2.5 Component Exposures and Measurement Error–Corrected Health Effect Inference

Background: Studies estimating health effects of long-term air pollution exposure often use a two-stage approach: building exposure models to assign individual-level exposures, which are then used in regression analyses. This requires accurate exposure modeling and careful treatment of exposure measurement error.

Objective: To illustrate the importance of accounting for exposure model characteristics in two-stage air pollution studies, we considered a case study based on data from the Multi-Ethnic Study of Atherosclerosis (MESA).

Methods: We built national spatial exposure models that used partial least squares and universal kriging to estimate annual average concentrations of four PM2.5 components: elemental carbon (EC), organic carbon (OC), silicon (Si), and sulfur (S). We predicted PM2.5 component exposures for the MESA cohort and estimated cross-sectional associations with carotid intima-media thickness (CIMT), adjusting for subject-specific covariates. We corrected for measurement error using recently developed methods that account for the spatial structure of predicted exposures.

Results: Our models performed well, with cross-validated R² values ranging from 0.62 to 0.95. Naïve analyses that did not account for measurement error indicated statistically significant associations between CIMT and exposure to OC, Si, and S. EC and OC exhibited little spatial correlation, and the corrected inference was unchanged from the naïve analysis. The Si and S exposure surfaces displayed notable spatial correlation, resulting in corrected confidence intervals (CIs) that were 50% wider than the naïve CIs, but that were still statistically significant.

Conclusion: The impact of correcting for measurement error on health effect inference is concordant with the degree of spatial correlation in the exposure surfaces. Exposure model characteristics must be considered when performing two-stage air pollution epidemiologic analyses because naïve health effect inference may be inappropriate.

Citation: Bergen S, Sheppard L, Sampson PD, Kim SY, Richards M, Vedal S, Kaufman JD, Szpiro AA. 2013. A national prediction model for PM2.5 component exposures and measurement error–corrected health effect inference. Environ Health Perspect 121:1017–1025; http://dx.doi.org/10.1289/ehp.1206010
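The abstract reports model performance as cross-validated R² but does not define the statistic. As a minimal sketch, assuming the common MSE-based definition (one minus the ratio of mean squared prediction error to the variance of the observations, floored at zero), it could be computed from held-out predictions as below. The function name `cv_r2` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def cv_r2(observed, predicted):
    """MSE-based R^2 from held-out predictions (assumed definition).

    `predicted[i]` is expected to come from a model that was fit
    without monitor i, e.g. in a k-fold cross-validation loop.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = fmean(observed)
    # Mean squared error of the held-out predictions.
    mse = fmean((o - p) ** 2 for o, p in zip(observed, predicted))
    # Variance of the observations around their own mean.
    var_obs = fmean((o - mean_obs) ** 2 for o in observed)
    if var_obs == 0:
        raise ValueError("observations have zero variance")
    # Floor at zero so a model worse than the mean is reported as 0.
    return max(0.0, 1.0 - mse / var_obs)


# Illustrative values only, not data from the study.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.5, 1.5, 3.5, 3.5]
print(round(cv_r2(obs, pred), 3))
```

Unlike a squared correlation, this form penalizes systematic bias in the predictions as well as scatter, which is one reason it is often preferred for evaluating exposure models.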
