On Identifying Disaster-Related Tweets: Matching-Based or Learning-Based?

Social media platforms such as Twitter are emerging as contributors to situational awareness during disasters. Information shared on Twitter by both the affected population (e.g., requesting assistance, issuing warnings) and those outside the impact zone (e.g., providing assistance) can help first responders, decision makers, and the public understand the situation first-hand. Effective use of such information requires timely selection and analysis of tweets that are relevant to a particular disaster. Although abundant tweets are a promising data source, automatically identifying relevant messages is challenging because tweets are short and unstructured, which results in unsatisfactory classification performance for conventional learning-based approaches. We therefore propose a simple yet effective algorithm that identifies relevant messages by matching keywords and hashtags, and we compare matching-based and learning-based approaches. To evaluate the two approaches, we place them in a framework proposed specifically for analyzing disaster-related tweets. Analysis results on eleven datasets covering various disaster types show that our technique yields relevant tweets of higher quality and more interpretable sentiment analysis results than the learning-based approach.
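
The matching-based idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given here; it only shows the general technique of flagging a tweet as relevant when it contains a listed keyword or hashtag. The keyword and hashtag lists, the tokenization rule, and the function name `is_relevant` are all assumptions made for illustration.

```python
import re

# Hypothetical keyword and hashtag lists for a single disaster (illustrative only).
KEYWORDS = {"flood", "flooding", "evacuate"}
HASHTAGS = {"#harvey", "#houstonflood"}

# A token is an optional leading '#' followed by word characters.
TOKEN_RE = re.compile(r"#?\w+")


def is_relevant(tweet, keywords=KEYWORDS, hashtags=HASHTAGS):
    """Return True if the tweet contains any listed keyword or hashtag.

    Matching is case-insensitive and works on whole tokens, so a word such as
    'floodlights' does not match the keyword 'flood'.
    """
    tokens = TOKEN_RE.findall(tweet.lower())
    words = {t for t in tokens if not t.startswith("#")}
    tags = {t for t in tokens if t.startswith("#")}
    return bool(words & keywords) or bool(tags & hashtags)


tweets = [
    "Roads closed due to flooding near downtown",
    "Stay safe everyone #Harvey",
    "Great coffee this morning",
    "New floodlights installed at the stadium",
]
relevant = [t for t in tweets if is_relevant(t)]
```

Whole-token matching is chosen here to keep every decision traceable to a specific keyword or hashtag; a real system would also need a principled way to build the lists for each disaster.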

[1]  Shady Elbassuoni,et al.  Practical extraction of disaster-relevant information from social media , 2013, WWW.

[2]  Lei Zhang,et al.  Sentiment Analysis and Opinion Mining , 2017, Encyclopedia of Machine Learning and Data Mining.

[3]  Mor Naaman,et al.  Unfolding the event landscape on twitter: classification and exploration of user categories , 2012, CSCW '12.

[4]  Cyrus Shahabi,et al.  GeoSocialBound: an efficient framework for estimating social POI boundaries using spatio--textual information , 2016, GeoRich@SIGMOD.

[5]  Huan Liu,et al.  Exploiting social relations for sentiment analysis in microblogging , 2013, WSDM.

[6]  Carlos Castillo,et al.  AIDR: artificial intelligence for disaster response , 2014, WWW.

[7]  John Yen,et al.  Classifying text messages for the haiti earthquake , 2011, ISCRAM.

[8]  Cornelia Caragea,et al.  Identifying informative messages in disaster events using Convolutional Neural Networks , 2016 .

[9]  Adam Acar,et al.  Twitter for crisis communication: lessons learned from Japan's tsunami disaster , 2011, Int. J. Web Based Communities.

[10]  Brooke Fisher Liu,et al.  Social media use during disasters: a review of the knowledge base and gaps. , 2012 .

[11]  Shanshan Zhang,et al.  Semi-supervised Discovery of Informative Tweets During the Emerging Disasters , 2016, ArXiv.

[12]  Shafiq R. Joty,et al.  Applications of Online Deep Learning for Crisis Response Using Social Media Information , 2016, ArXiv.

[13]  Yutaka Matsuo,et al.  Earthquake shakes Twitter users: real-time event detection by social sensors , 2010, WWW '10.

[14]  Mor Naaman,et al.  Finding and assessing social media information sources in the context of journalism , 2012, CHI.

[15]  Sarah Vieweg,et al.  Processing Social Media Messages in Mass Emergency , 2014, ACM Comput. Surv..

[16]  Petr Sojka,et al.  Software Framework for Topic Modelling with Large Corpora , 2010 .

[17]  Carlos Castillo,et al.  What to Expect When the Unexpected Happens: Social Media Communications Across Crises , 2015, CSCW.

[18]  Huan Liu,et al.  Is the Sample Good Enough? Comparing Data from Twitter's Streaming API with Twitter's Firehose , 2013, ICWSM.

[19]  Cornelia Caragea,et al.  Mapping moods: Geo-mapped sentiment analysis during hurricane sandy , 2014, ISCRAM.

[20]  Huan Liu,et al.  Unsupervised sentiment analysis with emotional signals , 2013, WWW.

[21]  Ross Maciejewski,et al.  Visualizing Social Media Sentiment in Disaster Scenarios , 2015, WWW.

[22]  Navneet Kaur,et al.  Opinion mining and sentiment analysis , 2016, 2016 3rd International Conference on Computing for Sustainable Global Development (INDIACom).

[23]  Fernando Diaz,et al.  Extracting information nuggets from disaster- Related messages in social media , 2013, ISCRAM.

[24]  Danah Boyd,et al.  The new war correspondents: the rise of civic media curation in urban warfare , 2013, CSCW.

[25]  Quoc V. Le,et al.  Distributed Representations of Sentences and Documents , 2014, ICML.

[26]  Wolfgang Nejdl,et al.  Understanding the diversity of tweets in the time of outbreaks , 2013, WWW.

[27]  Leysia Palen,et al.  "Voluntweeters": self-organizing by digital volunteers in times of crisis , 2011, CHI.

[28]  Denis Parra,et al.  Identifying Relevant Messages in a Twitter-based Citizen Channel for Natural Disaster Situations , 2015, WWW.

[29]  Aron Culotta,et al.  Tweedr: Mining twitter to inform disaster response , 2014, ISCRAM.

[30]  Leysia Palen,et al.  Microblogging during two natural hazards events: what twitter may contribute to situational awareness , 2010, CHI.

[31]  Philipp Koehn,et al.  Synthesis Lectures on Human Language Technologies , 2016 .

[32]  Beverly Estephany Parilla-Ferrer,et al.  SVM-Based Domain Adaptation Machine Learned Models for the Automatic Classification of Disaster-Related Tweets , 2022 .

[33]  Wolfgang Zenk-Möltgen,et al.  Geotagged Twitter posts from the United States: A tweet collection to investigate representativeness , 2016 .

[34]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[35]  Fernando Diaz,et al.  CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises , 2014, ICWSM.

[36]  Sarah Vieweg,et al.  Situational Awareness in Mass Emergency: A Behavioral and Linguistic Analysis of Microblogged Communications , 2012 .