An innovative ensemble learning air pollution early-warning system for China based on incremental extreme learning machine

Abstract Air pollution has many adverse effects on industrial production and public life, so constructing an efficient air quality early-warning system to guide both is an urgent task. This paper proposes an innovative air pollution early-warning system with four main modules: clustering, preprocessing, forecasting and evaluation. In the clustering module, the air pollution conditions of 31 provincial capitals are clustered, and the study areas are selected based on the clustering result. In the preprocessing module, a new data preprocessing algorithm first extracts the latent characteristics of the raw time series. In the forecasting module, the length-changeable incremental extreme learning machine is then used to forecast each component. In the evaluation module, air quality is qualitatively analyzed with the fuzzy evaluation method, and the Diebold-Mariano (DM) test and the superior predictive ability (SPA) test are used to assess the accuracy of the forecasting model. Experimental results on eighteen data sets from three cities show that the hybrid air quality early-warning system established in this study not only has higher accuracy and generalization ability than the benchmark models, but also provides sufficient air quality information, which is essential for controlling air pollution.
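To make the forecasting module more concrete, the following is a minimal Python sketch of the basic incremental extreme learning machine (one random hidden node added per step, with its output weight fitted to the current residual). It is not the length-changeable variant used in the paper, whose details are not given in this abstract. The function names `train_ielm` and `predict`, the sigmoid activation, the uniform weight range, and the toy sine data are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic activation used for every hidden node (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_ielm(X, y, max_nodes=50, seed=0):
    """Basic incremental ELM sketch: add one random hidden node at a time.

    X is a list of feature vectors, y a list of targets. Returns the list of
    nodes (weights, bias, output weight) and the squared training error after
    each added node (index 0 is the error before any node is added).
    """
    rng = random.Random(seed)
    n_features = len(X[0])
    residual = list(y)                      # error left to explain
    nodes = []
    history = [sum(e * e for e in residual)]
    for _ in range(max_nodes):
        # Hidden-node parameters are drawn at random and never retrained.
        w = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        b = rng.uniform(-1.0, 1.0)
        h = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in X]
        # Output weight: least-squares projection of the residual onto h.
        beta = sum(e * v for e, v in zip(residual, h)) / sum(v * v for v in h)
        residual = [e - beta * v for e, v in zip(residual, h)]
        nodes.append((w, b, beta))
        history.append(sum(e * e for e in residual))
    return nodes, history


def predict(nodes, x):
    """Sum the contributions of all hidden nodes for one input vector."""
    return sum(
        beta * sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b, beta in nodes
    )


if __name__ == "__main__":
    # Toy data (hypothetical): one period of a sine wave on [0, 1].
    X = [[i / 20] for i in range(21)]
    y = [math.sin(2 * math.pi * x[0]) for x in X]
    nodes, history = train_ielm(X, y, max_nodes=30)
    print("squared error before training:", history[0])
    print("squared error after 30 nodes: ", history[-1])
```

Because each output weight is the projection of the current residual onto the new node's output, the training error cannot increase as nodes are added. In a forecasting setting, each row of `X` would typically hold lagged values of one decomposed component of the series, which is again an assumption rather than something stated in the abstract.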
