Enabling Rapid Classification of Social Media Communications During Crises

Social media platforms such as Twitter, as used by affected people during crises, are considered a vital source of information for crisis response. However, rapid crisis response requires real-time analysis of online information. When a disaster happens, supervised machine learning, among other data processing techniques, can help classify online information in real time. However, the scarcity of labeled data leads to poorly performing classifiers. Labeled data from past events is often available. Can it be reused to train classifiers? We study the usefulness of labeled data from past events by observing the performance of classifiers trained on different combinations of training sets obtained from past disasters. Moreover, we propose two approaches (target labeling and active learning) to boost the classification performance of a learning scheme. We perform extensive experimentation on real crisis datasets and show the utility of past labeled data for training machine learning classifiers to process sudden-onset crisis-related data in real time.

Keywords: Social media, tweets classification, domain adaptation, disaster response
