Identifying disaster-related tweets and their semantic, spatial and temporal context using deep learning, natural language processing and spatial analysis: a case study of Hurricane Irma

ABSTRACT We introduce an analytical framework for analyzing tweets to (1) identify and categorize fine-grained details about a disaster, such as affected individuals, damaged infrastructure and disrupted services; and (2) distinguish impact areas and time periods, and the relative prominence of each category of disaster-related information across space and time. First, we identify disaster-related tweets by generating a human-labeled training dataset and experimenting with a series of deep learning and machine learning methods for binary classification of disaster-relatedness. We employ LSTM (Long Short-Term Memory) networks for the classification task because LSTM networks outperform the other methods by considering the whole text structure using long-term semantic word and feature dependencies. Second, we employ an unsupervised multi-label classification of tweets using Latent Dirichlet Allocation (LDA), and identify latent categories of tweets such as affected individuals and disrupted services. Third, we employ spatially-adaptive kernel smoothing and density-based spatial clustering to identify the relative prominence and the impact areas of each information category, respectively. Using Hurricane Irma as a case study, we analyze a collection of over 500 million keyword-based and geo-located tweets posted before, during and after the disaster. Our results highlight potential areas with a high density of affected individuals and infrastructure damage throughout the temporal progression of the disaster.
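As a rough illustration of the third step, the sketch below groups geo-located tweets into candidate impact areas with a minimal DBSCAN-style density-based clustering written in pure Python. The abstract does not state which clustering algorithm or parameters were used, so the choice of DBSCAN, the function names (haversine_km, dbscan), the thresholds (eps_km, min_pts) and the sample coordinates are all assumptions made for illustration only.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def dbscan(points, eps_km, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    unvisited, noise = None, -1
    labels = [unvisited] * len(points)

    def neighbors(i):
        # Brute-force neighbourhood query; includes the point itself.
        return [j for j in range(len(points))
                if haversine_km(points[i], points[j]) <= eps_km]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unvisited:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = noise          # may later become a border point
            continue
        cluster += 1                   # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster    # noise reachable from a core point
            if labels[j] is not unvisited:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is also a core point: expand
    return labels

# Hypothetical (lat, lon) tweet locations: two dense groups and one outlier.
tweets = [
    (25.761, -80.191), (25.762, -80.192), (25.763, -80.190),  # group A
    (27.950, -82.457), (27.951, -82.458), (27.949, -82.459),  # group B
    (30.330, -81.660),                                        # isolated
]
labels = dbscan(tweets, eps_km=5.0, min_pts=3)
print(labels)
```

The brute-force neighbourhood query is quadratic in the number of points, which is fine for a toy example; a collection of the size described above would need a spatial index. In practice the clustering would be run separately on the tweets assigned to each information category.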
