A Hybrid Machine Learning Pipeline for Automated Mapping of Events and Locations From Social Media in Disasters

This study proposes and tests a hybrid machine learning pipeline that uncovers how disaster events unfold at different locations, using social media posts published during disasters. Effective disaster response and recovery require a comprehensive understanding of the disaster situation, i.e., how events unfold and how disruptions are distributed geographically. Existing studies have employed machine learning methods for coarse-grained event detection and have analyzed geographic location information from geotagged social media data. However, only a very small fraction of social media posts are geotagged, and the geotag may not correspond to the events described in the post content. In addition, the coarse-grained information detected by existing approaches is token-based and does not provide sufficient information for situation awareness. Hence, detecting locations and finer-grained event information could significantly improve the utility, credibility, and interpretability of social media data for situation awareness. To address these limitations, this study proposes a hybrid machine learning pipeline that makes use of all relevant tweets to uncover the evolution of disaster events across different locations. The pipeline integrates Named Entity Recognition to detect locations mentioned in posts, a location fusion approach to extract the coordinates of those locations and remove noise, a fine-tuned BERT model to classify posts into humanitarian categories, and graph-based clustering to identify credible situational information. The approach is demonstrated using a data set collected from Twitter during the 2017 Hurricane Harvey in Houston. The results show that the proposed hybrid pipeline can automatically map events across time and space from social media posts with considerable accuracy.
The findings also suggest the potential for forensic analysis of disasters based on the mapped events and their evolution, and on the variation of social media attention to different locations. Hence, this method could provide a useful tool to support emergency managers, public officials, residents, first responders, and other stakeholders in rapid situation awareness across time and space.
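
The four pipeline stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name, place coordinate, keyword list, and threshold in it is a hypothetical stand-in. A small gazetteer lookup replaces Named Entity Recognition and location fusion, a keyword matcher replaces the fine-tuned BERT classifier, and connected components over a Jaccard-similarity graph replace the paper's graph-based clustering. The time dimension is left out for brevity.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical gazetteer (place name -> approximate lat/lon); a stand-in for
# NER plus location fusion, which the paper uses to resolve coordinates.
GAZETTEER = {
    "buffalo bayou": (29.762, -95.383),
    "addicks reservoir": (29.791, -95.623),
}

# Hypothetical keyword lists; a stand-in for the fine-tuned BERT classifier.
CATEGORY_KEYWORDS = {
    "rescue": {"rescue", "trapped", "stranded"},
    "infrastructure_damage": {"flooded", "closed", "outage"},
}


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    stripped = (w.strip(".,!?#").lower() for w in text.split())
    return {w for w in stripped if w}


def find_locations(text):
    """Return (name, coords) for every gazetteer entry mentioned in the text."""
    lowered = text.lower()
    return [(name, coords) for name, coords in GAZETTEER.items() if name in lowered]


def classify(text):
    """Pick the category with the most keyword hits, or 'other' if none match."""
    tokens = tokenize(text)
    scores = {cat: len(tokens & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(texts, threshold=0.3):
    """Connected components of a graph linking texts with Jaccard >= threshold."""
    tokens = [tokenize(t) for t in texts]
    parent = list(range(len(texts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(len(texts)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = defaultdict(list)
    for i in range(len(texts)):
        groups[find(i)].append(i)
    return sorted(groups.values())


def map_events(posts, min_cluster_size=2):
    """Map (location, category) to clusters of post indices that corroborate each other."""
    buckets = defaultdict(list)
    for idx, text in enumerate(posts):
        category = classify(text)
        for name, _coords in find_locations(text):
            buckets[(name, category)].append(idx)

    events = {}
    for key, idxs in buckets.items():
        groups = cluster([posts[i] for i in idxs])
        kept = [[idxs[k] for k in g] for g in groups if len(g) >= min_cluster_size]
        if kept:
            events[key] = kept
    return events


posts = [
    "Family trapped near Buffalo Bayou, need rescue now",
    "Need rescue now, family trapped near Buffalo Bayou bridge",
    "Roads flooded and closed around Addicks Reservoir",
    "Nice weather today",
]
events = map_events(posts)
```

In this sketch, a post with no recognized location is dropped, and a cluster is kept only when at least `min_cluster_size` similar posts mention the same place and category. That is a simple proxy for the credibility filtering the abstract attributes to graph-based clustering.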
