Research on a Regional Landslide Early-Warning Model Based on Machine Learning—A Case Study of Fujian Province, China

Landslide disasters are severe in China, and regional landslide early warning is an important means of disaster prevention and mitigation. Traditional regional landslide early-warning models, however, are constrained by the complex mechanisms that induce landslides, limited data accumulation, and inadequate big-data analysis methods, so their warnings have limited accuracy and insufficient refinement. This paper introduces machine learning into regional landslide early warning and systematically proposes a method for constructing a machine-learning-based regional early-warning model, covering training sample-set construction, sample learning and training, model parameter optimization, model preservation, and warning output. In sample learning and training, 80% of the sample set was used as the training set and 20% as the test set, with five-fold cross-validation. The Bayesian Optimization algorithm was used to optimize the model parameters, and accuracy, the ROC curve, and the AUC value were used to verify model accuracy and generalization ability. Taking China’s Fujian Province as an example, and based on nine years of geological and meteorological data (2010–2018), a total of 26 indicators in four categories (geological environment factors, factors of hazard-affected bodies, historical disaster situations, and rainfall-induced factors) were used as input features. Six machine learning algorithms were trained and compared. The Random Forest algorithm performed best, with an accuracy of 92.3%, and had the best generalization ability (AUC of 0.955). The second best was the Artificial Neural Network model, with an accuracy of 0.937 and an AUC of 0.935. Next were the Nearest Neighbor model, the Logistic Regression model, and the Support Vector Machine; the Decision Tree model gave the poorest results. Finally, a typical rainfall-induced landslide disaster process in Fujian Province was selected to verify the Random Forest model. Compared with the early-warning results of the original explicit statistical model, the hit rate of the new model was equal to or as much as 6 times that of the original model, and the landslide density in the new model's early-warning area was 1.6–1.7 times that of the original model. Preliminary verification showed that the new Random Forest-based model has clear advantages: a higher hit rate and a smaller warning area, which together enable more accurate warnings. Follow-up work will continue to track new landslide disasters in the study area and carry out model verification and correction.
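The evaluation protocol described in the abstract (an 80/20 split, five-fold cross-validation, and verification by accuracy and AUC) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code. The function names, the toy labels and scores, and the 0.5 decision threshold are all hypothetical, and the sketch assumes one reading of the abstract: 20% of the samples are held out as a test set and the five folds are drawn from the remaining 80%.

```python
import random


def split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out `test_fraction` of them as a test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]  # (train, test)


def k_fold(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        fit = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield fit, folds[i]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample outscores a
    random negative sample, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = landslide sample, 0 = non-landslide sample.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1, 0.7, 0.3]  # model-predicted probabilities
y_pred = [int(s >= 0.5) for s in scores]           # assumed 0.5 threshold

print("accuracy:", accuracy(y_true, y_pred))  # 0.75
print("AUC:", roc_auc(y_true, scores))        # 0.9375

# Assumed reading of the split: hold out 20%, cross-validate on the other 80%.
train, test = split_indices(100)
for fit, validate in k_fold(train, k=5):
    pass  # fit a candidate model on `fit`, score it on `validate`
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for illustration; a rank-based formulation would be preferable for a sample set of realistic size.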
