Groundwater spring potential modelling: Comprising the capability and robustness of three different modeling approaches

Abstract Sustainable water resources management in arid and semi-arid areas requires robust models that deliver accurate and reliable predictions. This need has motivated researchers to develop hybrid models that address modelling problems and predict groundwater potential zones accurately. This research investigates the capability and robustness of a novel hybrid model, the logistic model tree (LMT), and compares it with two state-of-the-art models, the support vector machine (SVM) and C4.5, for locating potential zones of groundwater springs. A dataset of 359 spring locations was compiled from field surveys and national reports, and from it three different sample datasets (S1–S3) were randomly prepared (70% for training and 30% for validation). Additionally, 16 spring-related factors were analyzed using logistic regression to determine which factors play a significant role in spring occurrence. Twelve significant geo-environmental and morphometric factors were identified and applied in all models. Model accuracy was evaluated with three threshold-dependent and threshold-independent measures: efficiency (E), the true skill statistic (TSS), and the area under the receiver operating characteristic curve (AUC-ROC). Results showed that the LMT model had the highest accuracy across all three validation datasets (mean E = 0.860, mean TSS = 0.718, mean AUC-ROC = 0.904), although it was sometimes slightly sensitive to changes in the input data. Furthermore, relative slope position (RSP) was the most important factor, followed by distance from faults and lithology.
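The evaluation protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the 0.5 probability threshold is an assumption, and E is assumed to be the usual proportion of correctly classified samples, TSS to be sensitivity plus specificity minus one, and AUC-ROC to be computed from pairwise rank comparisons.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into training and validation parts (70/30 by default)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding predicted probabilities.

    The 0.5 threshold is an assumption made for this sketch.
    """
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def efficiency(y_true, y_prob, threshold=0.5):
    """Threshold-dependent: share of samples classified correctly (assumed definition of E)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    return (tp + tn) / (tp + fp + tn + fn)


def true_skill_statistic(y_true, y_prob, threshold=0.5):
    """Threshold-dependent: sensitivity + specificity - 1."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def auc_roc(y_true, y_prob):
    """Threshold-independent: probability that a random positive outranks a random negative."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```

Calling `split_train_validation` three times with different seeds on the spring and non-spring samples would give three sample sets analogous to S1–S3; averaging each metric over the three validation parts gives mean values of the kind reported above.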
