MapLUR

Land-use regression (LUR) models are important for assessing air pollution concentrations in areas without measurement stations. While many such models exist, they often rely on manually constructed features based on restricted, locally available data. As a result, they are typically hard to reproduce and challenging to adapt to areas beyond those for which they were developed. In this paper, we advocate a paradigm shift for LUR models: we propose the Data-driven, Open, Global (DOG) paradigm, which entails models based on purely data-driven approaches that use only openly and globally available data. Progress within this paradigm will alleviate the need for experts to adapt models to the local characteristics of the available data sources and thus make it easier to generalize air pollution models to new areas on a global scale. To illustrate the feasibility of the DOG paradigm for LUR, we introduce a deep learning model called MapLUR. It is based on a convolutional neural network architecture and is trained exclusively on globally and openly available map data, without requiring manual feature engineering. We compare our model to state-of-the-art baselines such as linear regression, random forests, and multi-layer perceptrons using a large data set of modeled $\text{NO}_2$ concentrations in Central London. Our results show that MapLUR significantly outperforms these approaches even though they are provided with manually tailored features. Furthermore, we illustrate that the automatic feature extraction inherent to models based on the DOG paradigm can learn features that are readily interpretable and closely resemble those commonly used in traditional LUR approaches.
