Reduced-Rank Spatio-Temporal Modeling of Air Pollution Concentrations in the Multi-Ethnic Study of Atherosclerosis and Air Pollution.

There is growing evidence in the epidemiologic literature of a relationship between air pollution and adverse health outcomes. Prediction of individual air pollution exposure in the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air), funded by the Environmental Protection Agency (EPA), relies on a flexible spatio-temporal prediction model that integrates land-use regression with kriging to account for spatial dependence in pollutant concentrations. Temporal variability is captured using temporal trends estimated via a modified singular value decomposition, together with temporally varying spatial residuals. The model uses monitoring data from existing regulatory networks and supplementary MESA Air monitoring data to predict concentrations for individual cohort members. In general, spatio-temporal models are of limited use for large data sets because fitting them becomes computationally intractable. We develop reduced-rank versions of the MESA Air spatio-temporal model. To do so, we apply low-rank kriging to account for spatial variation in the mean process and discuss the limitations of this approach. As an alternative, we represent spatial variation using thin plate regression splines (TPRS). We compare the performance of these models, measured by cross-validated R², using EPA and MESA Air monitoring data to predict concentrations of oxides of nitrogen (NOx), a pollutant of primary interest in MESA Air, in the Los Angeles metropolitan area. Our findings suggest that reduced-rank models can improve computational efficiency in certain cases. Low-rank kriging and TPRS were competitive across the formulations considered, although TPRS appeared to be more robust in some settings.
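As a rough illustration of the comparison metric, the sketch below computes a cross-validated R² as 1 − MSE / Var(observations) from out-of-fold predictions. It is not the MESA Air model: the predictor is a plain least-squares regression standing in for any of the spatio-temporal models, and the function name `cv_r2`, the fold count, and the random fold assignment are assumptions made for the example.

```python
import numpy as np

def cv_r2(X, y, n_folds=10, seed=0):
    """Cross-validated R^2 = 1 - MSE / Var(y), using out-of-fold predictions.

    X: (n, p) covariate matrix; y: (n,) observed concentrations.
    The least-squares fit is a placeholder for the real prediction model.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, n_folds)
    pred = np.empty(len(y), dtype=float)
    for test in folds:
        # Train on every observation that is not in the held-out fold.
        train = np.setdiff1d(idx, test)
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        pred[test] = A_test @ beta
    mse = np.mean((y - pred) ** 2)
    return 1.0 - mse / np.var(y)

# Synthetic example: one informative covariate plus small noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 1))
y = 3.0 + 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)
score = cv_r2(X, y)
```

Because every prediction comes from a model that never saw the corresponding observation, the score penalizes overfitting, which makes it a reasonable basis for comparing full-rank and reduced-rank formulations.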
