Machine learning methods for landslide susceptibility studies: A comparative overview of algorithm performance

Abstract Landslides are among the most catastrophic natural hazards in mountainous areas, causing loss of life, damage to property, and economic disruption. Landslide susceptibility models prepared in a Geographic Information System (GIS) environment can be key to formulating disaster prevention measures and mitigating future risk. The accuracy and precision of susceptibility models are improving rapidly as the field moves from opinion-driven models and statistical learning toward increased use of machine learning (ML) techniques. Critical reviews of opinion-driven models and statistical learning in landslide susceptibility mapping have been published, but an overview of current ML models for landslide susceptibility studies, including background information on their operation, implementation, and performance, is currently lacking. Here, we present an overview of the most popular ML techniques available for landslide susceptibility studies. We find that only a handful of researchers use ML techniques in landslide susceptibility mapping studies. Therefore, we present the architecture of various ML algorithms in plain language, so that they are understandable to a broad range of geoscientists. Furthermore, a comprehensive study comparing the performance of various ML algorithms is absent from the current literature, which makes it difficult to assess their comparative performance and predictive capabilities. We therefore undertake an extensive analysis and comparison of different ML techniques using a case study from Algeria. We summarize and discuss the algorithms' accuracies, advantages, and limitations using a range of evaluation criteria. We note that tree-based ensemble algorithms achieve excellent results compared to other ML algorithms and that the Random Forest algorithm offers robust performance for accurate landslide susceptibility mapping, with only a small number of adjustments required before training the model.
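
As a minimal sketch of what comparing classifiers against "a range of evaluation criteria" can involve, the Python snippet below derives several widely used scores from binary predictions (1 = landslide, 0 = non-landslide). The choice of criteria (accuracy, precision, recall, F1, Cohen's kappa), the function name `evaluate`, and the toy labels are illustrative assumptions; the abstract does not state which criteria or data the case study uses.

```python
def evaluate(y_true, y_pred):
    """Return common binary-classification scores (hypothetical helper).

    y_true, y_pred: equal-length sequences of 0/1 labels,
    where 1 marks a landslide cell and 0 a non-landslide cell.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = ((p_observed - p_chance) / (1 - p_chance)
             if p_chance != 1 else 0.0)

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "kappa": kappa}


# Toy labels, invented purely for illustration.
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1]
scores = evaluate(observed, predicted)
print(scores)
```

Reporting several criteria side by side is useful because a single score can mislead: on an imbalanced inventory, for instance, accuracy can remain high while recall for the landslide class is poor.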
