Forecasting Air Quality Index Using an Ensemble of Artificial Neural Networks and Regression Models

Abstract. Air is essential to sustaining life on earth, and the air we inhale strongly affects our health and well-being, so the quality of the air in our environment should be monitored. To forecast the air quality index (AQI), two families of models are implemented. The first comprises artificial neural networks (ANNs) trained with conjugate gradient descent (CGD): the multilayer perceptron (MLP), cascade forward neural network, Elman neural network, radial basis function (RBF) neural network, and nonlinear autoregressive model with exogenous input (NARX). The second comprises regression models: multiple linear regression (MLR) fitted with the batch gradient descent (BGD), stochastic gradient descent (SGD), mini-batch gradient descent (MBGD), and CGD algorithms, and support vector regression (SVR). In these models, the AQI is the dependent variable, and the concentrations of NO2, CO, O3, PM2.5, SO2, and PM10 for the years 2010–2016 in Houston and Los Angeles are the independent variables. For the final forecast, several ensemble models that combine the individual neural network predictors and the individual regression predictors are presented. Among the models considered, the proposed ensemble approach forecasts the AQI most effectively.
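To make the regression side of this setup concrete, the following is a minimal sketch, not the authors' implementation. It fits an MLR model with the BGD, SGD, and MBGD update schemes and averages the three forecasts. Several things here are assumptions made for illustration: the data are synthetic stand-ins for standardized pollutant concentrations, the function names `fit_mlr_gd` and `predict` are hypothetical, the learning rate, epoch count, and batch size are arbitrary, and simple averaging is only one possible way to combine predictors.

```python
import numpy as np

def fit_mlr_gd(X, y, batch_size, lr=0.02, epochs=500, seed=0):
    """Fit y ~ X @ w + b by gradient descent on the mean squared error.

    batch_size = len(X) gives batch gradient descent (BGD),
    batch_size = 1 gives stochastic gradient descent (SGD),
    anything in between gives mini-batch gradient descent (MBGD).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append a bias column
    w = np.zeros(d + 1)
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the samples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            err = Xb[idx] @ w - y[idx]
            w -= lr * Xb[idx].T @ err / len(idx)
    return w

def predict(w, X):
    """Apply fitted weights (last entry is the bias) to new inputs."""
    return X @ w[:-1] + w[-1]

# Synthetic stand-in data (assumption): six standardized "pollutant"
# features and a linear "AQI" target with a little noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 6))
true_w = np.array([0.8, 0.3, 0.5, 1.2, 0.2, 0.9])
y = X @ true_w + 2.0 + rng.normal(scale=0.1, size=200)

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# One MLR predictor per update scheme.
models = {
    "BGD": fit_mlr_gd(X_train, y_train, batch_size=len(X_train)),
    "SGD": fit_mlr_gd(X_train, y_train, batch_size=1),
    "MBGD": fit_mlr_gd(X_train, y_train, batch_size=32),
}

# Simple averaging ensemble over the individual forecasts.
preds = np.column_stack([predict(w, X_test) for w in models.values()])
ensemble_pred = preds.mean(axis=1)
rmse = float(np.sqrt(np.mean((ensemble_pred - y_test) ** 2)))
print(f"ensemble RMSE: {rmse:.3f}")
```

On real measurements the three schemes would typically differ more than they do on this clean synthetic target, and a weighted or stacked combination could replace the plain mean.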
