Building a large-scale corpus for evaluating event detection on Twitter

Despite the popularity of Twitter for research, there are very few publicly available corpora, and those that are available are either too small or unsuitable for tasks such as event detection. This is partially due to a number of issues associated with the creation of Twitter corpora, including restrictions on the distribution of tweets and the difficulty of creating relevance judgements at such a large scale. For event detection, creating relevance judgements is made harder still by ambiguity in the definition of an event. In this paper, we propose a methodology for the creation of an event detection corpus. Specifically, we first create a new corpus that covers a period of 4 weeks and contains over 120 million tweets, which we make available for research. We then propose a definition of event that fits the characteristics of Twitter and, using this definition, generate a set of relevance judgements aimed specifically at the task of event detection. To do so, we use existing state-of-the-art event detection approaches and Wikipedia to generate a set of candidate events with associated tweets. We then use crowdsourcing to gather relevance judgements, and discuss the quality of the results, including how we ensured integrity and prevented spam. As a result of this process, along with our Twitter corpus, we release relevance judgements containing over 150,000 tweets, covering more than 500 events, which can be used for the evaluation of event detection approaches.
