Measurement error caused by spatial misalignment in environmental epidemiology.

In many environmental epidemiology studies, the locations and/or times of exposure measurements and health assessments do not match. In such settings, health effects analyses often use the predictions from an exposure model as a covariate in a regression model. Such exposure predictions contain measurement error because the predicted values do not equal the true exposures. We provide a framework for spatial measurement error modeling, showing that smoothing induces a Berkson-type measurement error with nondiagonal error structure. From this viewpoint, we review the existing approaches to estimation in a linear regression health model, including direct use of the spatial predictions and exposure simulation, and explore some modified approaches, including Bayesian models and out-of-sample regression calibration, motivated by measurement error principles. We then extend this work to the generalized linear model framework for health outcomes. Based on analytical considerations and simulation results, we compare the performance of these approaches under several spatial models for exposure. Our comparisons underscore several important points. First, exposure simulation can perform very poorly under certain realistic scenarios. Second, the relative performance of the different methods depends on the nature of the underlying exposure surface. Third, traditional measurement error concepts can help to explain the relative practical performance of the different methods. We apply the methods to data on the association between levels of particulate matter and birth weight in the greater Boston area.
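To make the Berkson-type error mentioned above concrete, the following is a minimal sketch in Python (using NumPy) and is not the authors' code or simulation design. It contrasts a Berkson error model, in which the true exposure equals the prediction plus independent error, with a classical error model, in which the measurement equals the true exposure plus independent error. It is a deliberate simplification: the errors are drawn independently, whereas the abstract describes a nondiagonal (spatially correlated) error structure. The function name `simulate_slopes`, the sample size, the unit variances, and the true slope of 2.0 are all illustrative assumptions.

```python
import numpy as np


def simulate_slopes(n=20000, beta=2.0, seed=0):
    """Return OLS slopes under Berkson and classical error (illustrative only).

    All distributions and parameter values here are assumptions chosen for
    illustration; errors are independent, unlike the spatially correlated
    errors discussed in the abstract.
    """
    rng = np.random.default_rng(seed)

    # Berkson error: true exposure = prediction + error independent of prediction.
    w_berkson = rng.normal(0.0, 1.0, n)              # exposure prediction
    x_berkson = w_berkson + rng.normal(0.0, 1.0, n)  # true exposure
    y_berkson = 1.0 + beta * x_berkson + rng.normal(0.0, 1.0, n)
    # Regress the health outcome on the prediction, not the true exposure.
    slope_berkson = np.polyfit(w_berkson, y_berkson, 1)[0]

    # Classical error: measurement = true exposure + error independent of truth.
    x_classical = rng.normal(0.0, 1.0, n)                # true exposure
    w_classical = x_classical + rng.normal(0.0, 1.0, n)  # noisy measurement
    y_classical = 1.0 + beta * x_classical + rng.normal(0.0, 1.0, n)
    slope_classical = np.polyfit(w_classical, y_classical, 1)[0]

    return slope_berkson, slope_classical


if __name__ == "__main__":
    slope_b, slope_c = simulate_slopes()
    print(f"Berkson slope:   {slope_b:.3f}")
    print(f"Classical slope: {slope_c:.3f}")
```

Under these assumptions, the slope estimated with Berkson error should land near the true value of 2.0, while the classical-error slope should be attenuated toward roughly half of it, because the error variance equals the exposure variance. This idealized contrast is one reason the type of error induced by smoothing matters for the health effect estimate.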
