Spatial and meteorological relevance in NO2 estimations: a case study in the Bay of Algeciras (Spain)

This study examines how to identify the most relevant variables for estimating hourly NO2 concentrations in a monitoring network located in the Bay of Algeciras (Spain). For each station in the network, hourly estimation models were built using artificial neural networks and multiple linear regression. Meteorological variables and hourly NO2 concentrations from nearby stations were used as inputs, and a feature selection procedure was applied as a preliminary step. The resulting models were compared statistically. For each station, the inputs used in the best estimation model were taken to be the most relevant variables for estimating hourly NO2 concentrations at that station. These estimations can be a useful resource for giving monitoring networks autonomous capabilities such as automatic decalibration detection or missing data imputation. Finally, the similarities between stations, in terms of the relevance of their variables, were analysed with a hierarchical clustering algorithm.
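As an illustration of the final step, the sketch below groups stations by which inputs their best model retained. It is a minimal example under stated assumptions, not the procedure used in the study: the station names, the candidate inputs, the 0/1 relevance flags, the Hamming distance and the average-linkage rule are all hypothetical choices, since the abstract does not specify them.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of positions where two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(relevance):
    """Agglomerative clustering; returns the merges in the order they occur.

    relevance: dict mapping station name -> tuple of 0/1 flags, one per
    candidate input (1 = input kept in that station's best model).
    """
    clusters = [(name,) for name in sorted(relevance)]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Average pairwise distance between members of the two clusters.
            d = sum(hamming(relevance[a], relevance[b]) for a in c1 for b in c2)
            d /= len(c1) * len(c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, d))
    return merges


# Hypothetical relevance flags. Assumed column order: wind speed,
# wind direction, temperature, NO2 at neighbour 1, NO2 at neighbour 2.
relevance = {
    "station_A": (1, 1, 0, 1, 0),
    "station_B": (1, 1, 0, 1, 0),
    "station_C": (0, 1, 1, 0, 1),
    "station_D": (0, 0, 1, 0, 1),
}

merges = average_linkage(relevance)
for left, right, dist in merges:
    print(left, "+", right, f"at distance {dist:.2f}")
```

Stations that share the same selected inputs merge first, so the merge order can be read as a dendrogram from the bottom up. With real data, the flags could be replaced by continuous relevance scores and a distance suited to them.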
