Statistical models for multi-step-ahead forecasting of fine particulate matter in urban areas

Abstract In recent years, atmospheric pollution in most metropolitan cities has reached crisis levels, and the need for air quality forecasting has grown. Among air pollutants, PM2.5 is considered the major pollutant in urbanized regions, especially because of its serious harmful effects on human health. There is therefore an urgent need for air quality forecasting programs capable of providing accurate predictions over a long horizon. Predicting PM2.5 concentrations several steps ahead is of great interest, especially for decisions on control policies and emergency measures such as traffic limitations, school closures, or temporary shutdowns of major polluting industrial units. In this paper, commonly used multi-step-ahead prediction strategies, namely Recursive (Rec), Direct (Dir), Direct-Recursive (DirRec), Multi-Input Multi-Output (MIMO), and Direct-MIMO (DIRMO), are examined in combination with two modelling techniques: autoregressive integrated moving average with exogenous variables (ARIMAX) and Multi-Layer Perceptron (MLP). The independent variables are also treated as time series and are forecast with ARIMA/MLP models so that they can be used to predict the dependent variable multiple steps ahead. The experimental study uses PM2.5 data from Mashhad, Iran. Daily PM2.5 forecasts for this city are provided for the next 10 days. Four different feature selection methods are also implemented and compared. The results indicate that the recursive strategy with LASSO feature selection in the ARIMAX model outperforms the alternatives at most time steps.
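To make the difference between the Recursive and Direct strategies concrete, the following is a minimal sketch, not the paper's implementation. It assumes a plain least-squares autoregression in place of ARIMAX or MLP, uses no exogenous variables or feature selection, and runs on a synthetic series rather than the Mashhad data. The function names (`make_lagged`, `fit_linear`, `recursive_forecast`, `direct_forecast`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_lagged(series, n_lags, step):
    """Build (X, y): each row of X holds n_lags consecutive values and
    y is the value observed `step` steps after the last of those lags."""
    X, y = [], []
    for t in range(n_lags, len(series) - step + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t + step - 1])
    return np.array(X), np.array(y)


def fit_linear(X, y):
    """Least-squares fit with an intercept (a stand-in for ARIMAX/MLP)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, x):
    """Evaluate the fitted linear model on one lag vector."""
    return float(coef[0] + np.dot(coef[1:], x))


def recursive_forecast(series, n_lags, horizon):
    """Recursive strategy: fit ONE one-step-ahead model, then feed each
    prediction back in as an input for the next step."""
    X, y = make_lagged(series, n_lags, step=1)
    coef = fit_linear(X, y)
    window = list(series[-n_lags:])
    preds = []
    for _ in range(horizon):
        p = predict_linear(coef, window[-n_lags:])
        preds.append(p)
        window.append(p)  # the prediction becomes the newest "observation"
    return np.array(preds)


def direct_forecast(series, n_lags, horizon):
    """Direct strategy: fit a SEPARATE model for every horizon h, each
    using only observed values as inputs."""
    last = series[-n_lags:]
    preds = []
    for h in range(1, horizon + 1):
        X, y = make_lagged(series, n_lags, step=h)
        preds.append(predict_linear(fit_linear(X, y), last))
    return np.array(preds)


# Synthetic daily series (assumption: illustrative data only).
rng = np.random.default_rng(0)
days = np.arange(200)
demo_series = 50 + 10 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 1, 200)

rec = recursive_forecast(demo_series, n_lags=7, horizon=10)
direct = direct_forecast(demo_series, n_lags=7, horizon=10)
```

The recursive variant needs only one fitted model but propagates its own errors forward through the horizon, while the direct variant fits ten models here (one per day ahead) and never consumes its own predictions. The remaining strategies named in the abstract (DirRec, MIMO, DIRMO) combine or generalize these two ideas and are not shown.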
