Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy

Local pollution is a problem that affects urban areas and has effects on quality of life and on health conditions. To avoid resorting to strict measures and to manage territories better, national authorities have applied a wide range of predictive models. Machine learning in particular has been studied over the last decades, in various cases and in various forms, as a way to simplify this problem. In this paper, we apply a regression-based analysis technique to a dataset containing official historical local pollution and weather data, looking for criteria that allow critical conditions to be forecast. The method was applied to the case study of Napoli, Italy, where the local environmental protection agency manages a set of fixed monitoring stations at which both chemical and meteorological data are recorded. The two raw datasets were joined using a maximum-inclusion strategy, that is, by performing the join in "outer" mode. Among the four regression models applied, namely the Linear Regression Model calculated with Ordinary Least Squares (LN-OLS), the Ridge regression model (Ridge), the Lasso model (Lasso) and Supervised Nearest Neighbors Regression (KNN), the Ridge regression model was found to perform best, with an R2 (Coefficient of Determination) value of 0.77 and low values for both MAE (Mean Absolute Error) and MSE (Mean Squared Error), equal to 0.12 and 0.04 respectively.
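As an illustration of the workflow described above, the following is a minimal sketch and not the authors' implementation. It assumes pandas and scikit-learn, which the abstract does not name. The column names (`timestamp`, `no2`, `wind_speed`), the toy values, the synthetic regression data and all hyperparameters are hypothetical. The first part shows an "outer" join that keeps every record from both tables; the second fits the four named model families and reports R2, MAE and MSE for each.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical toy tables standing in for the two raw datasets.
pollution = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00"]),
    "no2": [41.0, 38.5, 35.2],
})
weather = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2020-01-01 01:00", "2020-01-01 02:00", "2020-01-01 03:00"]),
    "wind_speed": [2.1, 3.4, 1.8],
})

# Outer join: keep every timestamp found in either table.
# Values missing on one side are filled with NaN.
merged = pd.merge(pollution, weather, on="timestamp", how="outer")
merged = merged.sort_values("timestamp").reset_index(drop=True)
print(merged)

# Synthetic regression data (assumption: a noisy linear target).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.1, size=200)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# The four model families named in the abstract; hyperparameters are guesses.
models = {
    "LN-OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

scores = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    scores[name] = {
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mean_squared_error(y_test, pred),
    }
    print(name, scores[name])
```

The scores printed here come from synthetic data and say nothing about the values reported in the paper. With real station data, the NaN entries introduced by the outer join would need to be imputed or dropped before fitting, since these estimators do not accept missing values.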
