Are OpenStreetMap building data useful for flood vulnerability modelling?

Abstract. Flood risk modelling aims to quantify the probability of flooding and the resulting consequences for exposed elements. The assessment of flood damage is a core task that requires the description of complex flood damage processes, including the influences of flooding intensity and vulnerability characteristics. Multi-variable modelling approaches are better suited for this purpose than simple stage–damage functions. However, multi-variable flood vulnerability models require detailed input data and often have problems predicting damage for regions other than those for which they were developed. Transferring vulnerability models usually results in a drop in predictive performance. Here we investigate whether data from the open-data source OpenStreetMap are suitable for modelling the flood vulnerability of residential buildings and whether the underlying standardized data model is helpful for transferring models across regions. We develop a new data set by calculating numerical spatial measures for residential-building footprints and combining these variables with an empirical data set of observed flood damage. From this data set, random forest regression models are learned using regional subsets and are tested for predicting flood damage in other regions. This regional split-sample validation approach reveals that the predictive performance of models based on OpenStreetMap building geometry data is comparable to that of alternative multi-variable models, which use comprehensive and detailed information about preparedness, socio-economic status and other aspects of residential-building vulnerability. The transfer of these models for application in other regions should include a test of model performance using independent local flood data.
Including numerical spatial measures based on OpenStreetMap building footprints reduces model prediction errors (mean absolute error, MAE, by 20 % and mean squared error, MSE, by 25 %) and increases the reliability of model predictions by a factor of 1.4 in terms of the hit rate when compared to a model that uses only water depth as a predictor. This also applies when the models are transferred to regions that were not used for model learning. Our results also show that using numerical spatial measures derived from OpenStreetMap building footprints does not resolve all problems of model transfer. Still, we conclude that these variables are useful proxies for flood vulnerability modelling because the data are consistent (i.e. input variables and the underlying data model have the same definition, format, units, etc.) and openly accessible, and thus make it easier and more cost-effective to transfer vulnerability models to other regions.
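
The workflow summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not list the spatial measures that were calculated, so area, perimeter and a compactness ratio are assumed here purely as examples, and the function names (`footprint_measures`, `mae`, `mse`, `regional_splits`) and the `region` record key are hypothetical. The sketch covers the footprint measures, the two error scores named in the abstract, and a split in which a model is learned on one region and tested on all others.

```python
import math


def footprint_measures(coords):
    """Return example spatial measures for a simple polygon footprint.

    coords: list of (x, y) vertices in a projected, metric coordinate
    system, without repeating the first vertex at the end.
    The choice of measures is an assumption for illustration.
    """
    n = len(coords)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(twice_area) / 2.0
    # Isoperimetric quotient: 1 for a circle, smaller for elongated shapes.
    compactness = 4.0 * math.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "compactness": compactness}


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def regional_splits(records, region_key="region"):
    """Yield (region, train, test): learn on one region, test on the rest."""
    regions = sorted({r[region_key] for r in records})
    for region in regions:
        train = [r for r in records if r[region_key] == region]
        test = [r for r in records if r[region_key] != region]
        yield region, train, test
```

In such a setup, a random forest regressor (for example scikit-learn's `RandomForestRegressor`, named here only as one possible choice) would be fitted on each `train` subset and scored with `mae` and `mse` on the corresponding `test` subset, once with water depth alone and once with the footprint measures added.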
