Predicting and Investigating the Permeability Coefficient of Soil with Aided Single Machine Learning Algorithm

The permeability coefficient of soil is an essential parameter in the design of geotechnical structures. The aim of this paper was to select the best-performing and most reliable machine learning (ML) model for predicting the permeability coefficient of soil, and to quantify the importance of each input feature for the predicted value with the aid of SHapley Additive exPlanations (SHAP) and one-dimensional Partial Dependence Plots (PDP 1D). To this end, five single ML algorithms, namely K-nearest neighbors (KNN), support vector machine (SVM), light gradient boosting machine (LightGBM), random forest (RF), and gradient boosting (GB), are used to build models for predicting the permeability coefficient of soil. Model performance is assessed with the coefficient of determination (R²), root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE). On the testing dataset, the best-performing and most reliable single ML model is the gradient boosting (GB) model, with R² = 0.971, RMSE = 0.199 × 10⁻¹¹ m/s, MAE = 0.161 × 10⁻¹¹ m/s, and MAPE = 0.185%. To identify and quantify the importance of each feature for the permeability coefficient of soil, sensitivity studies using permutation importance, SHAP, and PDP 1D are carried out with the GB model. In order of decreasing effect on the predicted permeability coefficient, the features rank as follows: plasticity index and density > water content, liquid limit, and plastic limit > clay content > void ratio. The plasticity index and density are therefore the soil properties to measure first when assessing the permeability coefficient of soil.
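The workflow summarized above (fit a gradient boosting regressor, score it with R², RMSE, MAE, and MAPE on a held-out test set, then rank features by permutation importance) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' implementation: the data are synthetic, and the feature column names, the 70/30 split, the default hyperparameters, and the helper names `make_synthetic_soil_data` and `fit_and_evaluate` are all assumptions introduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.metrics import (
    mean_absolute_error,
    mean_absolute_percentage_error,
    mean_squared_error,
    r2_score,
)
from sklearn.model_selection import train_test_split

# Hypothetical column names mirroring the seven inputs listed in the abstract.
FEATURES = [
    "plasticity_index", "density", "water_content", "liquid_limit",
    "plastic_limit", "clay_content", "void_ratio",
]


def make_synthetic_soil_data(n_samples=300, seed=0):
    """Generate made-up data; NOT the paper's dataset.

    The target is an arbitrary function dominated by the first two
    columns, chosen only so that the importance ranking is visible.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_samples, len(FEATURES)))
    y = (1.0 + 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.5 * X[:, 2]
         + rng.normal(0.0, 0.05, size=n_samples))
    return X, y


def fit_and_evaluate(X, y, seed=0):
    """Fit a GB regressor, score it on a test split, and rank features."""
    # Assumed 70/30 train/test split; the abstract does not state the ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )
    model = GradientBoostingRegressor(random_state=seed)  # default settings
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    metrics = {
        "R2": r2_score(y_test, y_pred),
        "RMSE": float(np.sqrt(mean_squared_error(y_test, y_pred))),
        "MAE": mean_absolute_error(y_test, y_pred),
        # scikit-learn returns MAPE as a fraction; multiply by 100 for percent.
        "MAPE_percent": 100.0 * mean_absolute_percentage_error(y_test, y_pred),
    }

    # Permutation importance: drop in test score when one column is shuffled.
    perm = permutation_importance(
        model, X_test, y_test, n_repeats=10, random_state=seed
    )
    order = np.argsort(perm.importances_mean)[::-1]
    ranking = [FEATURES[i] for i in order]
    return model, metrics, ranking


if __name__ == "__main__":
    X, y = make_synthetic_soil_data()
    _, metrics, ranking = fit_and_evaluate(X, y)
    print(metrics)
    print(ranking)
```

The ranking printed here reflects only the synthetic target, not real soil behavior. SHAP values and 1D partial dependence curves could be computed on the same fitted model, but they are left out of this sketch to keep it short.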
