Statistical models to assess the health effects and to forecast ground-level ozone

Using statistical approaches, we attempt to bridge two aspects of the ground-level ozone problem: the assessment of health effects, and forecasting and warning. The recent literature disagrees on the adverse health effects of tropospheric ozone pollution. Based on a panel study of children in Leipzig, we identified a non-linear (quadratic) concentration-response relationship between ozone and respiratory symptoms. Our results indicate that including ozone as a linear covariate might misspecify the model, which might explain the inconsistent results of several field studies on the health effects of ozone. We conclude that there is an urgent demand for forecasts of high-ozone episodes, which may help susceptible persons to avoid high exposure. Novel approaches to statistical modelling and data mining are helpful tools in operational smog forecasting. We present a rigorous assessment of the performance of 15 different statistical techniques in an inter-comparison study based on data sets from 10 European regions. To evaluate the results of the inter-comparison exercise, we suggest an integrated assessment procedure that takes the unbalanced study design into account. This procedure is based on estimating a statistical model for the performance indices as a function of predefined factors such as site, forecasting technique, and forecasting horizon. We find that the best predictions can be achieved for sites located in rural and suburban areas in Central Europe. For operational air pollution forecasting we recommend neural networks and generalised additive models, which can handle non-linear associations between atmospheric variables. As an example, we demonstrate the application of a Generalised Additive Model (GAM). GAMs are based on smoothing splines for the covariates, i.e., meteorological parameters and concentrations of other pollutants.
Finally, we find that respiratory symptoms are associated with the daily maximum of the 8-h average ozone concentration, which in turn is best predicted by non-linear statistical models. The new air quality directive of the European Commission (Directive 2002/3/EC) accounts for the special relevance of the 8-h mean ozone concentration.
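
For illustration, the daily maximum of the 8-h average can be computed from hourly ozone readings roughly as follows. This is a minimal sketch, not the procedure used in the study: the function name `daily_max_8h_mean`, the input layout (7 hours of the previous day followed by the 24 hours of the target day), the assignment of each window to the day on which it ends, and the rule of skipping any window that contains a missing value are all assumptions made for this example.

```python
import math


def daily_max_8h_mean(hourly, window=8):
    """Return the largest trailing `window`-hour mean for one day.

    `hourly` holds (window - 1) readings from the previous day followed
    by the 24 readings of the target day; missing values are NaN.
    Assumption: a window counts for the day on which it ends, and a
    window with any missing value is skipped.
    """
    expected = 24 + window - 1
    if len(hourly) != expected:
        raise ValueError(f"expected {expected} hourly values, got {len(hourly)}")

    best = None
    # Slide over every window that ends within the target day.
    for end in range(window - 1, len(hourly)):
        chunk = hourly[end - window + 1 : end + 1]
        if any(math.isnan(v) for v in chunk):
            continue  # incomplete window (assumed rule)
        mean = sum(chunk) / window
        if best is None or mean > best:
            best = mean
    return best  # None if no complete window exists


# Hypothetical day: background of 40 with a 4-hour afternoon peak of 120.
example = [40.0] * 7 + [40.0] * 12 + [120.0] * 4 + [40.0] * 8
print(daily_max_8h_mean(example))
```

Because the 8-h mean smooths short peaks, a brief episode raises this index much less than it raises the 1-h maximum, so the two indices can rank days differently.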
