EMTerms 1.0: A Terminological Resource for Crisis Tweets

We present the first release of EMTerms (Emergency Management Terms), the largest crisis-related terminological resource to date, containing over 7,000 terms used on Twitter to describe various crises. Practitioners can use this resource to search for relevant messages on Twitter during crises, and computer scientists can use it to develop new automatic methods for processing crisis-related tweets. The terms were collected starting from a seed set manually annotated by a linguist and an emergency manager in tweets posted during four crisis events. A Conditional Random Fields (CRF) method was then applied to tweets from 35 crisis events to expand the set of terms while overcoming the difficulty of obtaining further annotations from emergency managers. The terms are classified into 23 information-specific categories using a combination of expert annotations and crowdsourcing. This article presents the terminology extraction methodology in detail, as well as the final results.
