Social Media Integration of Flood Data: A Vine Copula-Based Approach

Floods are the most common and among the most severe natural disasters in many countries around the world. As global warming continues to exacerbate sea level rise and extreme weather, governmental authorities and environmental agencies face a pressing need for timely and accurate evaluations and predictions of flood risk. Current flood forecasts are generally based on historical measurements of environmental variables at monitoring stations. In recent years, in addition to these traditional data sources, large amounts of flood-related information have become available via social media, where members of the public promptly post information and updates on local environmental phenomena. Despite growing scholarly interest in the use of online data during natural disasters, the majority of studies focus exclusively on social media as a stand-alone data source, while its joint use with other types of information remains unexplored. In this paper we fill this gap by integrating traditional historical information on floods with data extracted from Twitter and Google Trends. Our methodology is based on vine copulas, which allow us to capture, in a very flexible way, the dependence structure among the marginals; the marginals themselves are modelled via appropriate time series methods. We apply our methodology to data from three coastal locations on the south coast of the United Kingdom (UK). The results show that our approach, based on the integration of social media data, outperforms traditional methods in the evaluation and prediction of flood events.
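
As an illustration of the vine copula idea mentioned above, the following sketch writes out the standard pair-copula decomposition of a three-dimensional density. The symbols are assumptions introduced here for illustration and are not taken from the paper: x_1, x_2, x_3 stand for three generic variables (for example, one environmental measurement and two social media series), f_i and F_i are their marginal densities and distribution functions, and c denotes a bivariate copula density. The last factor is written under the usual simplifying assumption that the conditional copula does not depend on the value of x_2.

```latex
% D-vine (pair-copula) decomposition of a trivariate density.
% Marginals f_i can be fitted separately; dependence enters only
% through the bivariate copula densities c_{12}, c_{23}, c_{13|2}.
\begin{align*}
f(x_1, x_2, x_3)
  &= f_1(x_1)\, f_2(x_2)\, f_3(x_3) \\
  &\quad \times c_{12}\bigl(F_1(x_1), F_2(x_2)\bigr)\,
           c_{23}\bigl(F_2(x_2), F_3(x_3)\bigr) \\
  &\quad \times c_{13|2}\bigl(F_{1|2}(x_1 \mid x_2),\, F_{3|2}(x_3 \mid x_2)\bigr)
\end{align*}
```

Each pair copula can be chosen from a different parametric family, which is one way to read the flexibility referred to in the abstract; the specific families and vine structure used by the authors are not stated here.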
