Online Sequential Learning Based on Extreme Learning Machines for Particulate Matter Forecasting

Particulate matter (PM), the microscopic solid particles and liquid droplets suspended in the air, can significantly affect not only human health but also urban, natural and agricultural systems. It is therefore imperative to keep the concentration of these pollutants below harmful thresholds. Forecasting tools based on machine learning have been used to estimate the concentration of PM and other atmospheric pollutants. However, PM data are collected continuously, producing a data stream whose distribution may change over time. Because traditional machine learning techniques have no mechanism for handling changes in the data distribution at run time, their prediction accuracy is usually limited in such scenarios. The overall goal of this work is to evaluate whether online sequential learning can improve the accuracy of PM forecasting. To that end, online and offline algorithms based on Extreme Learning Machines (ELM) were compared on the task of forecasting hourly PM concentrations. The experiments used real-world data streams of PM concentration from different cities in the State of São Paulo, Brazil. The results show not only that online sequential learning approaches lead to smaller mean squared errors but also that combining such approaches in ensembles makes the results more stable.
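To make the idea of online sequential learning with an ELM concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a single-sample recursive least-squares update of the output weights with a ridge-regularized start (instead of an initial batch), lagged concentrations as inputs, and a test-then-train evaluation loop. The names OnlineELM, prequential_mse and make_series are hypothetical, and the series is synthetic, with an artificial level shift standing in for a change in the data distribution.

```python
import math
import random


class OnlineELM:
    """Single-hidden-layer ELM whose output weights are updated one sample at a time."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are drawn at random and never trained (the ELM idea).
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Assumption: start from beta = 0 and P = I / ridge rather than an initial batch.
        self.beta = [0.0] * n_hidden
        self.p = [[1.0 / ridge if i == j else 0.0 for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def _hidden(self, x):
        # Sigmoid activation of each hidden neuron.
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(row, x)) + bi)))
                for row, bi in zip(self.w, self.b)]

    def predict(self, x):
        return sum(h * b for h, b in zip(self._hidden(x), self.beta))

    def update(self, x, y):
        h = self._hidden(x)
        ph = [sum(pij * hj for pij, hj in zip(row, h)) for row in self.p]
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, ph))
        err = y - sum(hi * bi for hi, bi in zip(h, self.beta))
        n = len(h)
        # Sherman-Morrison step: P <- P - (P h)(P h)^T / (1 + h^T P h).
        for i in range(n):
            for j in range(n):
                self.p[i][j] -= ph[i] * ph[j] / denom
        # beta <- beta + P_new h * err, where P_new h = (P h) / denom.
        for i in range(n):
            self.beta[i] += ph[i] / denom * err


def prequential_mse(model, series, n_lags):
    """Test-then-train: predict each value from the previous n_lags, then learn from it."""
    sq_err, count = 0.0, 0
    for t in range(n_lags, len(series)):
        x, y = series[t - n_lags:t], series[t]
        sq_err += (model.predict(x) - y) ** 2
        model.update(x, y)
        count += 1
    return sq_err / count


def make_series(n=600, shift_at=300):
    """Synthetic hourly-like signal with a daily cycle and one level shift (not real PM data)."""
    return [(0.5 if t < shift_at else 0.8) + 0.4 * math.sin(2 * math.pi * t / 24)
            for t in range(n)]


if __name__ == "__main__":
    series = make_series()
    model = OnlineELM(n_inputs=3, n_hidden=10)
    print("prequential MSE:", prequential_mse(model, series, n_lags=3))
```

Because the model is scored on each sample before learning from it, the resulting error reflects how it would behave on a live stream. An offline ELM, by contrast, would fit its output weights once on a fixed training window and keep them unchanged afterwards.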
