A Framework to Predict High-Resolution Spatiotemporal PM2.5 Distributions Using a Deep-Learning Model: A Case Study of Shijiazhuang, China

Airborne particulate matter with a diameter of less than 2.5 micrometers (PM2.5) has aroused widespread concern and is a core indicator of severe air pollution in many cities worldwide. In this study, we present a validated framework for predicting daily PM2.5 distributions from daily aerosol optical depth (AOD) datasets, using Shijiazhuang City, China, as a case study. The framework involves three steps: obtaining high-resolution spatiotemporal AOD distributions, estimating the spatial distributions of PM2.5, and predicting those distributions with a convolutional long short-term memory (ConvLSTM) model. For estimation, the eXtreme gradient boosting (XGBoost) model was selected because it achieved the lowest root mean square error (RMSE), 32.86 µg/m³, and the highest coefficient of determination (R²), 0.71, among the models compared; the baselines were linear regression, ridge regression, least absolute shrinkage and selection operator (LASSO) and Cubist. For prediction, ConvLSTM was validated and compared, in both time and space, against a seasonal autoregressive integrated moving average (SARIMA) model, a traditional time-series prediction model. ConvLSTM was more accurate, with an overall average prediction RMSE of 14.94 µg/m³ compared with 17.41 µg/m³ for SARIMA. Furthermore, ConvLSTM's predictions are more stable over time, with fewer fluctuations, and it reduces spatial prediction errors more effectively than SARIMA.
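The two metrics used above to rank the estimation and prediction models, RMSE and R², can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and chosen only for illustration; they are not the study's data or code.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. µg/m³)."""
    n = len(observed)
    squared_errors = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared_errors / n)


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up PM2.5 values (µg/m³) at four hypothetical monitoring stations.
observed = [35.0, 80.0, 120.0, 60.0]
# Outputs of two hypothetical candidate models for the same stations.
pred_a = [40.0, 70.0, 110.0, 65.0]
pred_b = [50.0, 50.0, 90.0, 90.0]

for name, pred in (("model A", pred_a), ("model B", pred_b)):
    print(f"{name}: RMSE={rmse(observed, pred):.2f}, R2={r2_score(observed, pred):.2f}")
```

A lower RMSE and a higher R² both indicate a closer fit, which is the selection criterion described for XGBoost against the baseline models.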
