A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods.

Predicting environmental noise in urban environments is a complex, non-linear problem, because of the complex relationships among the many variables involved in characterizing and modelling environmental noise and environmental-noise magnitudes. Moreover, accounting for the great spatial heterogeneity characteristic of urban environments seems essential to achieve accurate environmental-noise prediction in cities. This paper addresses the problem by proposing and applying a procedure based on feature-selection techniques and machine-learning regression methods. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq): (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, the high number of input variables involved in environmental-noise modelling and estimation in urban environments makes LAeq prediction models quite complex and costly, in terms of time and resources, to apply to real situations. Three techniques are therefore used for feature selection or data reduction: two feature-selection techniques, (i) correlation-based feature-subset selection (CFS) and (ii) wrapper for feature-subset selection (WFS), and one data-reduction technique, (iii) principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. Using WFS as the feature-selection technique with SMO or GPR as the regression algorithm provides the best LAeq estimation (R² = 0.94 and mean absolute error (MAE) = 1.14-1.16 dB(A)).
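As a rough illustration of the wrapper idea and of the two reported error measures (MAE and R²), the following minimal Python sketch runs a greedy forward wrapper search around a regressor and scores each candidate subset on held-out data. Everything in it is an assumption made for illustration: the synthetic data, the function names, the single hold-out split, and the k-nearest-neighbour regressor, which stands in for the MLP, SMO and GPR methods named in the abstract. It is not the authors' implementation.

```python
import random

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def knn_predict(X_train, y_train, x, cols, k=5):
    """Average the targets of the k nearest training rows, using only `cols`."""
    dists = sorted(
        (sum((row[c] - x[c]) ** 2 for c in cols), y)
        for row, y in zip(X_train, y_train)
    )
    return sum(y for _, y in dists[:k]) / k

def score(X_train, y_train, X_val, y_val, cols):
    """Return (MAE, R^2) on the validation split for a feature subset."""
    preds = [knn_predict(X_train, y_train, x, cols) for x in X_val]
    return mae(y_val, preds), r2(y_val, preds)

def wrapper_forward_selection(X_train, y_train, X_val, y_val):
    """Greedy forward search: add the feature that lowers validation MAE most,
    and stop as soon as no remaining feature improves it."""
    n_features = len(X_train[0])
    selected, best_mae = [], float("inf")
    while len(selected) < n_features:
        candidates = [c for c in range(n_features) if c not in selected]
        trial_mae, trial_col = min(
            (score(X_train, y_train, X_val, y_val, selected + [c])[0], c)
            for c in candidates
        )
        if trial_mae >= best_mae:
            break
        selected.append(trial_col)
        best_mae = trial_mae
    return selected, best_mae

# Synthetic data (hypothetical): only features 0 and 2 drive the target.
rng = random.Random(0)
X = [[rng.random() for _ in range(4)] for _ in range(300)]
y = [3.0 * row[0] + 2.0 * row[2] + rng.gauss(0.0, 0.05) for row in X]
X_train, y_train, X_val, y_val = X[:200], y[:200], X[200:], y[200:]

selected, val_mae = wrapper_forward_selection(X_train, y_train, X_val, y_val)
val_r2 = score(X_train, y_train, X_val, y_val, selected)[1]
print("selected features:", selected)
print("validation MAE: %.3f, R^2: %.3f" % (val_mae, val_r2))
```

The sketch reuses one hold-out split both to guide the search and to report the final scores, which keeps it short but makes the scores optimistic. A more careful setup would use cross-validation inside the search and a separate test set for the final MAE and R².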
