Estimates of daily ground-level NO2 concentrations in China based on big data and machine learning approaches

Nitrogen dioxide (NO2) is one of the most important atmospheric pollutants. However, current ground-level NO2 concentration data lack either high resolution or full nationwide coverage, owing to the poor quality of source data and the limited computing power available to the models. To our knowledge, this study is the first to estimate ground-level NO2 concentrations in China with national coverage and relatively high spatiotemporal resolution (0.25 degree; daily intervals) over the most recent six years (2013-2018). We developed a Random Forest model integrated with K-means (RF-K) to produce the estimates from multi-source parameters. Besides meteorological and satellite-retrieval parameters, we also introduce, for the first time, socio-economic parameters to assess the impact of human activities. The results show that: (1) the RF-K model we developed shows better prediction performance than other models, with cross-validation R2 = 0.64 (MAPE = 34.78%); (2) the annual average NO2 concentration in China showed a weak increasing trend, whereas in economic zones such as the Beijing-Tianjin-Hebei region, the Yangtze River Delta, and the Pearl River Delta, the NO2 concentration decreased or remained unchanged, especially in spring. Our dataset confirms that pollutant control targets have been achieved in these areas. By mapping daily nationwide ground-level NO2 concentrations, this study provides timely, high-quality data for air quality management in China. We also provide a universal model framework, based on improved machine learning methods, for quickly generating timely national maps of atmospheric pollutant concentrations at high spatiotemporal resolution.
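
For readers unfamiliar with the two scores quoted above, the following minimal sketch shows how R2 and MAPE are commonly computed from paired observed and predicted concentrations. It assumes that R2 means the coefficient of determination and that MAPE is expressed in percent; the abstract does not state which exact definitions were used, and the function names and sample values are hypothetical, not taken from the study.

```python
def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (assumed definition)."""
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Made-up NO2 values for illustration only; these are not data from the study.
observed = [20.0, 40.0, 60.0, 80.0]
predicted = [25.0, 35.0, 66.0, 72.0]

print("R2  :", round(r2_score(observed, predicted), 3))
print("MAPE:", round(mape(observed, predicted), 3), "%")
```

In a cross-validation setting, these scores would typically be computed on held-out samples only, so that they reflect predictive rather than fitted performance.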
