A critical assessment of shrinkage-based regression approaches for estimating the adverse health effects of multiple air pollutants

Most investigations of the adverse health effects of multiple air pollutants analyse the time series involved by simultaneously entering the multiple pollutants into a Poisson log-linear model. Concerns have been raised about this type of analysis, and it has been stated that new methodology or models should be developed for investigating the adverse health effects of multiple air pollutants. In this paper, we introduce the use of the lasso for this purpose and compare its statistical properties to those of ridge regression and the Poisson log-linear model. Ridge regression has been used in time series analyses on the adverse health effects of multiple air pollutants but its properties for this purpose have not been investigated. A series of simulation studies was used to compare the performance of the lasso, ridge regression, and the Poisson log-linear model. In these simulations, realistic mortality time series were generated with known air pollution mortality effects permitting the performance of the three models to be compared. Both the lasso and ridge regression produced more accurate estimates of the adverse health effects of the multiple air pollutants than those produced using the Poisson log-linear model. This increase in accuracy came at the expense of increased bias. Ridge regression produced more accurate estimates than the lasso, but the lasso produced more interpretable models. The lasso and ridge regression offer a flexible way of obtaining more accurate estimation of pollutant effects than that provided by the standard Poisson log-linear model.
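For orientation, the three estimators compared here can be written in a common form. The notation below is an illustrative assumption and is not taken from the paper: $y_t$ is the mortality count on day $t$, $x_{tj}$ the concentration of pollutant $j$ on day $t$, $z_t$ a vector of confounder terms (for example weather and time trend), and $\lambda \ge 0$ a tuning parameter. It is also an assumption that only the pollutant coefficients $\beta_1, \dots, \beta_p$ are penalised.

```latex
% Standard Poisson log-linear model (assumed notation)
y_t \sim \mathrm{Poisson}(\mu_t), \qquad
\log \mu_t = \beta_0 + \sum_{j=1}^{p} \beta_j x_{tj} + z_t^{\top}\gamma

% Log-likelihood maximised by the standard model
\ell(\beta, \gamma) = \sum_{t=1}^{T} \bigl( y_t \log \mu_t - \mu_t - \log y_t! \bigr)

% Ridge regression: squared (L2) penalty on the pollutant coefficients
(\hat\beta, \hat\gamma)_{\mathrm{ridge}}
  = \arg\max_{\beta, \gamma} \Bigl\{ \ell(\beta, \gamma) - \lambda \sum_{j=1}^{p} \beta_j^{2} \Bigr\}

% Lasso: absolute-value (L1) penalty on the pollutant coefficients
(\hat\beta, \hat\gamma)_{\mathrm{lasso}}
  = \arg\max_{\beta, \gamma} \Bigl\{ \ell(\beta, \gamma) - \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \Bigr\}

% Bias-variance decomposition of the mean squared error of one estimate
\mathrm{MSE}(\hat\beta_j)
  = \bigl( \mathrm{E}[\hat\beta_j] - \beta_j \bigr)^{2} + \mathrm{Var}(\hat\beta_j)
```

Setting $\lambda = 0$ in either penalised objective recovers the standard Poisson log-linear fit. The lasso penalty can set some $\beta_j$ exactly to zero, which is the usual sense in which it gives more interpretable models, whereas ridge regression shrinks all coefficients towards zero without removing any. If accuracy is read as mean squared error (an assumption about the metric used), the last identity shows how estimates can become more accurate while bias increases: shrinkage reduces the variance by more than it adds in squared bias.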
