Tree-based threshold modeling for short-term forecast of daily maximum ozone level

This paper proposes a simple class of threshold autoregressive models for forecasting daily maximum ozone concentrations in Southern California. Linear time series models have been widely used in environmental modeling. However, this class of models fails to capture the nonlinearity in the ozone process and the complexity of meteorological interactions with ozone. In this article, we used threshold autoregressive models with two classes of regimes: periodic and meteorological. Days of the week defined the periodic regimes, and the regression tree method defined the meteorological regimes as a function of meteorological variables. As the reference model, we used an autoregressive model with lagged ozone and various lagged meteorological variables as covariates. The proposed models were fitted to a 3-year dataset of daily maximum ozone concentrations from five monitoring stations in San Bernardino County, CA, and their forecast performance was evaluated on an independent year-long dataset from the same stations. The results showed that the threshold models capture the nonlinearity in the ozone process well and remove the nonstationarity in the model residuals. The threshold models outperformed the non-threshold autoregressive models in day-ahead forecasts. The tree-based model performed slightly better than the periodic threshold model.
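To make the idea of meteorologically defined regimes concrete, the following is a minimal sketch, not the model fitted in the paper. It assumes a single meteorological driver (a hypothetical lagged temperature series), a single split (a depth-one tree), and an AR(1) model within each regime; the split is chosen to minimize the pooled AR(1) residual sum of squares, which is an illustrative criterion and may differ from how the regression tree was grown in the study. The data are synthetic and all function names are invented for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_stump_tar(ozone, temp, min_size=10):
    """Two-regime AR(1) with the regime set by one split on lagged temp.

    Assumption: the split minimizes the summed AR(1) residual SSE of
    the two regimes. Returns None if no split leaves min_size points
    in both regimes.
    """
    lag, target, driver = ozone[:-1], ozone[1:], temp[:-1]
    best = None
    for c in sorted(set(driver)):
        lo = [i for i, d in enumerate(driver) if d <= c]
        hi = [i for i, d in enumerate(driver) if d > c]
        if len(lo) < min_size or len(hi) < min_size:
            continue
        a0, b0, s0 = fit_line([lag[i] for i in lo], [target[i] for i in lo])
        a1, b1, s1 = fit_line([lag[i] for i in hi], [target[i] for i in hi])
        if best is None or s0 + s1 < best["sse"]:
            best = {"threshold": c, "low": (a0, b0),
                    "high": (a1, b1), "sse": s0 + s1}
    return best


def forecast(model, ozone_today, temp_today):
    """Day-ahead forecast from today's ozone and today's temperature."""
    low_regime = temp_today <= model["threshold"]
    a, b = model["low"] if low_regime else model["high"]
    return a + b * ozone_today


def simulate(n=400, seed=0):
    """Synthetic series (not real ozone data) with a regime change at 30."""
    rng = random.Random(seed)
    temp = [rng.uniform(15.0, 40.0) for _ in range(n)]
    ozone = [50.0]
    for t in range(1, n):
        if temp[t - 1] <= 30.0:
            mean = 20.0 + 0.3 * ozone[-1]
        else:
            mean = 40.0 + 0.6 * ozone[-1]
        ozone.append(mean + rng.gauss(0.0, 2.0))
    return ozone, temp


ozone, temp = simulate()
model = fit_stump_tar(ozone, temp)
single_regime_sse = fit_line(ozone[:-1], ozone[1:])[2]
print("estimated threshold:", model["threshold"])
print("threshold SSE:", model["sse"], "single-regime SSE:", single_regime_sse)
```

On the training data, the two-regime fit can never have a larger residual sum of squares than the single-regime AR(1), so an in-sample comparison alone says little; this is why the study evaluates forecasts on an independent year of data.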
