Support vector machine and artificial neural network to model soil pollution: a case study in Semnan Province, Iran

Abstract

To study the extent of soil pollution in Shahrood and Damghan, located in Semnan Province, Iran, 229 soil samples were taken and the concentrations of 12 heavy metals (Ag, Co, Pb, Tl, Be, Ni, Cd, Ba, Cu, V, Zn and Cr) were analyzed. Elevated values of some heavy metals, such as Cr, Ni and V, were detected in the study area. To predict the soil pollution index (SPI) from the concentrations of the 12 heavy metals, support vector machines (SVMs) with different kernels (linear, RBF and polynomial) and artificial neural networks (ANNs) were used. The database was repeatedly and randomly split into training and testing sets, and both SVMs and ANNs were trained and tested on each split. To optimize the parameters of the SVMs for each kernel, the testing results of the support vector regression (SVR) model were compared across combinations of parameter values. The out-of-sample generalization ability was high and roughly the same for all kernels; the RBF kernel was therefore selected for comparison with ANNs trained with early stopping. The correlation coefficients between predicted and observed SPI were 0.997 for the RBF kernel and 0.995 for the ANN with early stopping, indicating comparable performance of the two methods. The results indicate that, because of problems associated with ANNs (such as local minima), SVMs are preferable in cases where ANNs and SVMs give comparable results.
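The evaluation protocol described in the abstract (repeated random train/test splits, scored by the correlation between predicted and observed values on the held-out set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, the target is a hypothetical index rather than the paper's SPI formula, the 70/30 split and the number of splits are arbitrary, and `fit_row_mean_line` is a deliberately simple stand-in model. An SVR or ANN would be passed in through the same `fit` argument.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def fit_row_mean_line(X, y):
    """Stand-in model (NOT an SVM or ANN): least-squares line on each row's mean."""
    m = [sum(row) / len(row) for row in X]
    mm, my = sum(m) / len(m), sum(y) / len(y)
    slope = (sum((a - mm) * (b - my) for a, b in zip(m, y))
             / sum((a - mm) ** 2 for a in m))
    intercept = my - slope * mm
    # Return a predictor for a single sample (one row of concentrations).
    return lambda row: intercept + slope * (sum(row) / len(row))


def repeated_holdout(X, y, fit, n_splits=10, test_fraction=0.3, seed=0):
    """Return the held-out correlation for each random train/test split."""
    rng = random.Random(seed)
    n = len(X)
    n_test = max(2, int(round(test_fraction * n)))
    scores = []
    for _ in range(n_splits):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition for every repetition
        test, train = idx[:n_test], idx[n_test:]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        predicted = [predict(X[i]) for i in test]
        observed = [y[i] for i in test]
        scores.append(pearson_r(predicted, observed))
    return scores


# Synthetic placeholder data with the study's dimensions: 229 samples x 12 metals.
data_rng = random.Random(42)
X = [[data_rng.uniform(0.0, 3.0) for _ in range(12)] for _ in range(229)]
# Hypothetical index (row mean plus noise); an assumption, not the paper's SPI.
y = [sum(row) / 12 + data_rng.gauss(0.0, 0.05) for row in X]

scores = repeated_holdout(X, y, fit_row_mean_line)
print(f"mean test r over {len(scores)} splits: {sum(scores) / len(scores):.3f}")
```

Averaging the score over many random splits, rather than relying on a single split, reduces the chance that a favourable partition of a small data set makes one model look better than another.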
