Modeling and Testing Landslide Hazard Using Decision Tree

This paper proposes a decision tree model for ranking the importance of 21 factors that contribute to landslides across a wide area of Penang Island, Malaysia. These factors are vegetation cover, distance from the fault line, slope angle, cross curvature, slope aspect, distance from road, geology, diagonal length, longitudinal curvature, rugosity, plan curvature, elevation, precipitation, soil texture, surface area, distance from drainage, roughness, land cover, general curvature, tangent curvature, and profile curvature. Decision tree models are used for prediction, classification, and assessing factor importance, and are usually represented by an easy-to-interpret, tree-like structure. Four models were created using Chi-square Automatic Interaction Detector (CHAID), Exhaustive CHAID, Classification and Regression Tree (CRT), and Quick-Unbiased-Efficient Statistical Tree (QUEST). The twenty-one factors were extracted using digital elevation models (DEMs) and then used as input variables for the models. A data set of 137570 samples was selected for each variable in the analysis, where 68786 samples represent landslides and 68786 samples represent no landslides. The models were tested with 10-fold cross-validation. Exhaustive CHAID achieved the highest accuracy (82.0%), followed by CHAID (81.9%), CRT (75.6%), and QUEST (74.0%). Across the four models, five factors were identified as the most important: slope angle, distance from drainage, surface area, slope aspect, and cross curvature.
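As an illustration of the evaluation procedure only, the following minimal sketch runs 10-fold cross-validation on a one-split tree (a "stump") chosen by Gini impurity, the kind of impurity measure CRT-style trees use. It is not the paper's implementation: the data are synthetic, the single `slopes` feature and its distributions are assumptions made for the example, and every function name is hypothetical. CHAID, Exhaustive CHAID, and QUEST use different split-selection rules that are not shown here.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def fit_stump(xs, ys):
    """Pick the threshold on one feature that minimises weighted Gini impurity."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            # Each side predicts its majority class.
            left_label = int(sum(left) * 2 >= len(left))
            right_label = int(sum(right) * 2 >= len(right))
            best = (score, t, left_label, right_label)
    _, t, left_label, right_label = best
    return t, left_label, right_label

def predict(stump, x):
    t, left_label, right_label = stump
    return left_label if x <= t else right_label

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(xs, ys, k=10):
    """Mean held-out accuracy over k folds (plain, non-stratified k-fold)."""
    accuracies = []
    for held_out in k_fold_indices(len(xs), k):
        test = set(held_out)
        train_x = [x for i, x in enumerate(xs) if i not in test]
        train_y = [y for i, y in enumerate(ys) if i not in test]
        stump = fit_stump(train_x, train_y)
        correct = sum(predict(stump, xs[i]) == ys[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k

# Synthetic, balanced data (assumption): steeper slopes for the landslide class.
rng = random.Random(42)
slopes = [rng.gauss(35, 5) for _ in range(100)] + [rng.gauss(15, 5) for _ in range(100)]
labels = [1] * 100 + [0] * 100

mean_accuracy = cross_validate(slopes, labels, k=10)
print(f"mean 10-fold accuracy: {mean_accuracy:.3f}")
```

Each sample is held out exactly once, so the reported figure is an average over ten models, each trained on nine tenths of the data. A full tree would apply the same split search recursively and across all input factors.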
