Monthly suspended sediment load prediction using artificial intelligence: testing of a new random subspace method

ABSTRACT The predictive capability of a new artificial intelligence method, random subspace (RS), for predicting suspended sediment load in rivers was compared with that of commonly used methods: random forest (RF) and two support vector machine (SVM) models, one using a radial basis function kernel (SVM-RBF) and one using a normalized polynomial kernel (SVM-NPK). Using river discharge, rainfall and river stage data from the Haraz River, Iran, the results revealed that: (a) the RS model provided superior predictive accuracy (Nash-Sutcliffe efficiency, NSE = 0.83) to SVM-RBF (NSE = 0.80), SVM-NPK (NSE = 0.78) and RF (NSE = 0.68), corresponding to very good, good, satisfactory and unsatisfactory accuracies in load prediction, respectively; (b) the RBF kernel outperformed the NPK kernel; (c) predictive capability was most sensitive to gamma and epsilon in the SVM models, to the maximum tree depth and the number of features in the RF model, and to the classifier type, number of trees and subspace size in the RS model; and (d) suspended sediment loads were most closely correlated with river discharge (Pearson correlation coefficient, PCC = 0.76). Overall, the results show that RS models have great potential in data-poor watersheds, such as the one studied here, to produce strong predictions of suspended load based on monthly records of river discharge, rainfall depth and river stage alone.
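The two scores quoted in the abstract, NSE and PCC, can be computed as in the minimal sketch below. It uses the standard definitions of the Nash-Sutcliffe efficiency and the Pearson correlation coefficient. The function names and the sample values are hypothetical and are not taken from the Haraz River data set.

```python
import numpy as np

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model error
    to the variance of the observations around their mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    squared_error = np.sum((observed - simulated) ** 2)
    observed_spread = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - squared_error / observed_spread

def pcc(x, y):
    """Pearson correlation coefficient between two series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical monthly values, for illustration only.
discharge = np.array([12.0, 30.0, 55.0, 20.0, 8.0])
observed_load = np.array([100.0, 420.0, 900.0, 250.0, 60.0])
predicted_load = np.array([120.0, 400.0, 850.0, 300.0, 80.0])

print("NSE:", nse(observed_load, predicted_load))
print("PCC (discharge vs. load):", pcc(discharge, observed_load))
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the model does no better than predicting the mean of the observations, which is why NSE values are commonly mapped to qualitative ratings such as those listed in the abstract.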
