A novel hourly PM2.5 concentration prediction model based on feature selection, training set screening, and mode decomposition-reorganization

Abstract: Accurate prediction of the concentrations of PM2.5 and other air pollutants can provide early warning information for sustainable urban pollution control, urban construction, and travel planning. This paper establishes a new hybrid PM2.5 concentration prediction model that combines feature selection, training set screening, mode decomposition and reorganization, and machine learning. First, historical data are screened by random forest (RF) and the grey system approximation model (GSA). Second, the processed data are decomposed by time varying filtering based empirical mode decomposition (TVFEMD). Then, an extreme learning machine (ELM) optimized by the moth flame optimization algorithm (MFO) is used for prediction. Based on data from four cities in the Beijing-Tianjin-Hebei region, the following conclusions are drawn: (1) The proposed model is shown to be effective and robust, and its evaluation indexes are the best. (2) RF-GSA effectively improves the quality of the training set. (3) Mode decomposition and reorganization effectively improve prediction accuracy. The model can provide a reference for government policy-making and residents' travel.
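To make the prediction step more concrete, the following is a minimal sketch of a basic extreme learning machine, assuming lagged concentration values as inputs. It is not the authors' implementation: the MFO tuning, RF-GSA screening, and TVFEMD decomposition are omitted, and the helper names (`make_lagged`, `elm_fit`, `elm_predict`), the hidden-layer size, the lag count, and the synthetic series are all hypothetical choices for illustration.

```python
import numpy as np

def make_lagged(series, n_lags):
    """Build (X, y): each row of X holds the n_lags values preceding y."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

def elm_fit(X, y, n_hidden=20, seed=0):
    """Basic ELM: random fixed hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (not trained)
    b = rng.normal(size=n_hidden)                # random hidden biases (not trained)
    H = np.tanh(X @ W + b)                       # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ y                 # output weights via pseudoinverse
    return W, b, beta

def elm_predict(X, model):
    """Apply a fitted ELM to new inputs."""
    W, b, beta = model
    return np.tanh(X @ W + b) @ beta

# Synthetic hourly series with a 24-hour cycle (an assumption, not real PM2.5 data).
t = np.arange(200)
raw = 50.0 + 20.0 * np.sin(2 * np.pi * t / 24)
series = (raw - raw.mean()) / raw.std()          # standardize so tanh does not saturate

X, y = make_lagged(series, n_lags=6)
model = elm_fit(X[:150], y[:150])                # train on the first 150 samples
pred = elm_predict(X[150:], model)               # predict the remaining samples
rmse = float(np.sqrt(np.mean((pred - y[150:]) ** 2)))
print("test RMSE (standardized units):", rmse)
```

In a hybrid scheme like the one described, an optimizer such as MFO would search over the random input weights and biases (here fixed by `seed`) instead of accepting a single random draw, and one such model would typically be fitted per decomposed component.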
