Data-driven models to forecast PM10 concentration

This paper studies the phenomena responsible for urban and suburban air pollution. The analysis continues the work of the NeMeFo (neural meteo forecasting) research project on short-term forecasting of meteorological data. The study analyzed the principal causes of air pollution and identified, for each air pollutant, the best subset of features (meteorological data and air pollutant concentrations) for predicting its medium-term concentration, with particular attention to PM10, the particulate matter with an aerodynamic diameter of up to 10 µm. The best subset of features was selected by means of a backward selection algorithm based on the information-theoretic notion of relative entropy. The final aim of the research is a prognostic tool able to reduce the risk that air pollutant concentrations exceed the alarm thresholds fixed by law. This tool will be implemented using data-driven models based on some of the most widespread statistical learning techniques (artificial neural networks and support vector machines).
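The backward selection step can be illustrated with a minimal sketch. It is not the algorithm used in the study, whose details are not given here: it assumes discretized features, estimates conditional distributions from raw counts, and at each step greedily drops the feature whose removal changes the conditional distribution of the target the least, as measured by the mean relative entropy (Kullback-Leibler divergence). The function names, the feature names in the comments, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def conditional_dist(rows, labels, cols):
    """Estimate P(y | x[cols]) from counts; returns {key: {label: prob}}."""
    counts = defaultdict(Counter)
    for row, y in zip(rows, labels):
        counts[tuple(row[c] for c in cols)][y] += 1
    return {
        key: {y: n / sum(ctr.values()) for y, n in ctr.items()}
        for key, ctr in counts.items()
    }


def expected_kl(rows, labels, cols, drop):
    """Mean over samples of KL(P(y | cols) || P(y | cols without drop))."""
    reduced = [c for c in cols if c != drop]
    p_full = conditional_dist(rows, labels, cols)
    p_red = conditional_dist(rows, labels, reduced)
    total = 0.0
    for row in rows:
        p = p_full[tuple(row[c] for c in cols)]
        q = p_red[tuple(row[c] for c in reduced)]
        # q[y] > 0 whenever p[y] > 0, because the reduced group
        # contains every sample of the full group.
        total += sum(pv * math.log(pv / q[y]) for y, pv in p.items())
    return total / len(rows)


def backward_select(rows, labels, n_keep):
    """Greedily remove the least informative feature until n_keep remain."""
    cols = list(range(len(rows[0])))
    while len(cols) > n_keep:
        losses = {c: expected_kl(rows, labels, cols, c) for c in cols}
        cols.remove(min(losses, key=losses.get))
    return cols


# Synthetic toy data (not measurements): three binary features, e.g.
# 0 = "low wind", 1 = "high temperature", 2 = "high humidity".
# The hypothetical target (PM10 above threshold) depends on feature 0 only.
rows = list(product([0, 1], repeat=3))
labels = [row[0] for row in rows]

selected = backward_select(rows, labels, n_keep=1)
print(selected)
```

On this toy data the procedure keeps only feature 0, since removing either of the other two leaves the conditional distribution of the target unchanged. Real concentration and meteorological series are continuous, so a practical version would need a density estimate or a binning step in place of the raw counts.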
