Spatial prediction of flood susceptibility using random-forest and boosted-tree models in Seoul metropolitan city, Korea

ABSTRACT Flood frequency is increasing under the impact of climate change, and the damage emphasized on flood-risk maps is based on actual flooded-area data. This study therefore creates flood-susceptibility maps for the Seoul metropolitan area using random-forest and boosted-tree models in a geographic information system (GIS) environment. For the flood-susceptibility mapping, flooded-area, topography, geology, soil and land-use datasets were collected and compiled into spatial datasets. From the spatial datasets, 12 factors were calculated and extracted as input data for the models. The flooded area of 2010 was used to train the models, and the flooded area of 2011 was used for validation. The importance of each factor was then calculated and, lastly, the maps were validated. Distance from the river, geology and the digital elevation model showed high importance among the factors. The random-forest model showed validation accuracies of 78.78% and 79.18% for the regression and classification algorithms, respectively, and the boosted-tree model showed validation accuracies of 77.55% and 77.26% for the regression and classification algorithms, respectively. The flood-susceptibility maps provide meaningful information for decision-makers identifying priority areas for flood-mitigation management.
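
The workflow summarized in the abstract (train on one flood event, validate on another, then rank the factors) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes scikit-learn, uses synthetic data in place of the 2010 and 2011 flooded areas, uses three hypothetical factor names instead of the 12 factors, covers only the classification variant, and scores with ROC AUC, a metric the abstract does not name.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import roc_auc_score

# Hypothetical factor names; the study uses 12 factors that are not listed here.
FACTORS = ["distance_from_river", "elevation", "slope"]


def make_event(n_cells, rng):
    """Build synthetic grid cells for one flood event (assumption, not real data).

    Flooding is made more likely close to the river and at low elevation;
    slope is left as pure noise.
    """
    X = rng.uniform(0.0, 1.0, size=(n_cells, len(FACTORS)))
    logit = 4.0 - 6.0 * X[:, 0] - 3.0 * X[:, 1]
    p_flood = 1.0 / (1.0 + np.exp(-logit))
    y = (rng.uniform(size=n_cells) < p_flood).astype(int)
    return X, y


rng = np.random.default_rng(0)
X_train, y_train = make_event(2000, rng)  # stands in for the 2010 flooded area
X_valid, y_valid = make_event(2000, rng)  # stands in for the 2011 flooded area

models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "boosted_tree": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Predicted probability of the flooded class serves as the susceptibility index.
    susceptibility = model.predict_proba(X_valid)[:, 1]
    results[name] = {
        "auc": roc_auc_score(y_valid, susceptibility),
        "importance": dict(zip(FACTORS, model.feature_importances_)),
    }

for name, res in results.items():
    print(name, round(res["auc"], 3), res["importance"])
```

Fitting on one event and scoring on a separate event, rather than on a random split of a single event, keeps the validation data independent of the training data, which mirrors the 2010/2011 design described above.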
