Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods

Abstract This study evaluated the generalizability of five models to identify a suitable approach for landslide susceptibility modeling in data-scarce environments. In total, 418 landslide inventories and 18 landslide conditioning factors were analyzed. Multicollinearity analysis and factor optimization were performed before modeling, and two experiments were then conducted. In each experiment, five susceptibility maps were produced using support vector machine (SVM), random forest (RF), weight-of-evidence (WoE), ridge regression (Rid_R), and robust regression (RR) models. The SVM model achieved the highest accuracy (AUC = 0.85) with both the full and the limited landslide inventories. The RF and WoE models were severely affected when fewer landslide samples were used for training, whereas the other models were only slightly affected.
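The abstract reports model accuracy as the area under the ROC curve (AUC). As a minimal sketch of what that metric measures, and not the authors' implementation, the AUC can be computed as the fraction of (landslide, non-landslide) pairs in which the landslide cell receives the higher susceptibility score. The function name `auc_score` and the toy labels and scores below are hypothetical and chosen only for illustration.

```python
def auc_score(labels, scores):
    """AUC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    labels: 1 for a landslide sample, 0 for a non-landslide sample.
    scores: predicted susceptibility, higher meaning more susceptible.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one landslide and one non-landslide sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # landslide sample ranked above non-landslide sample
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: three landslide and three non-landslide samples.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
print(auc_score(labels, scores))
```

An AUC of 0.5 corresponds to a ranking no better than chance and 1.0 to a perfect ranking, which gives the reported 0.85 its scale. The pairwise loop is quadratic in the number of samples, so a rank-based formulation would be preferable for large validation sets.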
