Landslide Susceptibility Mapping: Machine and Ensemble Learning Based on Remote Sensing Big Data

Predicting landslide occurrences is difficult, yet failing to do so can be catastrophic, leading to property damage, community displacement, and human casualties. Research into landslide susceptibility mapping (LSM) attempts to mitigate such catastrophes by identifying landslide-prone areas. Computational modelling techniques have been successful in related disaster scenarios, which motivates this work to explore such modelling for LSM. In this research, the potential of supervised machine learning and ensemble learning is investigated. First, the Flexible Discriminant Analysis (FDA) supervised learning algorithm is trained for LSM and compared against other algorithms that have been widely used for the same purpose, namely Generalized Logistic Models (GLM), Boosted Regression Trees (BRT or GBM), and Random Forest (RF). Next, an ensemble model consisting of all four algorithms is implemented to examine possible performance improvements. The dataset used to train and test all the algorithms consists of a landslide inventory map of 227 landslide locations. From these sources, 13 conditioning factors are extracted for use in the models. The models are evaluated using the True Skill Statistic (TSS), the Receiver Operating Characteristic (ROC) curve, and the kappa index. The results show that the ensemble model obtained the best TSS (0.6986), ROC (0.904), and kappa (0.6915). FDA on its own appears effective at modelling landslide susceptibility from multiple data sources, with performance comparable to GLM, although it slightly underperforms GBM (BRT) and RF. Among the individual algorithms, RF appears the most capable when all conditioning factors are used.
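As an illustration of the evaluation measures named above, the following minimal Python sketch computes TSS and the kappa index from a binary confusion matrix (landslide = 1, non-landslide = 0), using the standard definitions TSS = sensitivity + specificity - 1 and kappa = (p_o - p_e) / (1 - p_e). The function names are hypothetical and are not taken from the study. The `mean_ensemble` helper assumes an unweighted average of per-model probabilities; the abstract does not state how the four models were actually combined, so this is only one plausible scheme.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = landslide."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def true_skill_statistic(tp, fp, fn, tn):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def kappa_index(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (observed - expected) / (1 - expected)


def mean_ensemble(model_probs):
    """ASSUMPTION: unweighted mean of per-model landslide probabilities.

    model_probs holds one list per model, all of equal length.
    """
    n_models = len(model_probs)
    return [sum(column) / n_models for column in zip(*model_probs)]


if __name__ == "__main__":
    # Made-up toy labels, not data from the study.
    counts = confusion_counts([1, 1, 0, 0], [1, 0, 0, 0])
    print("TSS:", true_skill_statistic(*counts))
    print("kappa:", kappa_index(*counts))
```

Both measures reward correct predictions in both classes, which matters for susceptibility maps where landslide and non-landslide samples must each be classified well.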
