A New Feature Selection Methodology for Environmental Modelling Support: The Case of Thessaloniki Air Quality

The status of an environmental system is described by a (usually large) set of parameters. Relevant models therefore employ a large feature space, which makes feature selection necessary for better modelling results. Many methods have been used to reduce the number of features while safeguarding environmental model performance and keeping computational time low. In this study, a new feature selection methodology is presented that makes use of the Self-Organizing Map (SOM) method. SOM visualization values are used as a similarity measure between the parameter to be forecasted and the parameters of the feature space. The method leads to the smallest set of parameters that surpass a similarity threshold. Results obtained for the case of Thessaloniki air quality forecasting are comparable to those of other feature selection methods.
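The selection idea can be illustrated with a minimal sketch. It is not the implementation used in the study: the abstract does not specify how the SOM visualization values are compared, so the sketch assumes that each variable's component plane (its weight across all map units) is the visualization, and that the absolute Pearson correlation between a feature's plane and the target's plane is the similarity measure. The function names `train_som` and `select_features`, the map size, the training schedule and the threshold value are all hypothetical choices made for illustration.

```python
import numpy as np


def train_som(data, rows=6, cols=6, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a small rectangular SOM online.

    Returns a codebook of shape (rows * cols, n_vars); column j is the
    component plane of variable j, flattened over the map units.
    """
    rng = np.random.default_rng(seed)
    n_samples, _ = data.shape
    # Initialise codebook vectors from randomly chosen samples.
    weights = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    # Grid coordinates of every map unit, used by the neighbourhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    total_steps = epochs * n_samples
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood radius
            # Best-matching unit: the codebook vector closest to the sample.
            bmu = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            # Gaussian neighbourhood around the BMU on the map grid.
            dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights


def select_features(X, y, threshold=0.7, **som_kwargs):
    """Keep the features whose component plane resembles the target's plane.

    Similarity is ASSUMED to be the absolute Pearson correlation between
    component planes; the original study may use a different measure.
    """
    data = np.column_stack([X, y]).astype(float)
    # Standardise so that no variable dominates the distance computation.
    std = data.std(axis=0)
    data = (data - data.mean(axis=0)) / np.where(std > 0, std, 1.0)
    weights = train_som(data, **som_kwargs)
    target_plane = weights[:, -1]
    sims = np.array([
        abs(np.corrcoef(weights[:, j], target_plane)[0, 1])
        for j in range(X.shape[1])
    ])
    selected = [j for j in range(X.shape[1]) if sims[j] >= threshold]
    return selected, sims


if __name__ == "__main__":
    # Synthetic data: feature 0 drives the target, feature 1 is pure noise.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(300, 2))
    y = 2.0 * X[:, 0] + 0.05 * rng.normal(size=300)
    selected, sims = select_features(X, y, threshold=0.7)
    print("selected feature indices:", selected)
    print("similarities:", np.round(sims, 3))
```

In this sketch the target is trained into the map together with the candidate features, so that all component planes share the same map layout and can be compared unit by unit. The threshold directly controls the size of the retained set: raising it keeps fewer features.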
