An evaluation of the run-time and task-based performance of event detection techniques for Twitter

Twitter's increasing popularity as a source of up-to-date news and information about current events has spawned a body of research on event detection techniques for social media data streams. Although all proposed approaches provide some evidence as to the quality of the detected events, none relate this task-based performance to their run-time performance in terms of processing speed, data throughput, or memory usage. In particular, neither a quantitative nor a comparative evaluation of these aspects has been performed to date. In this article, we study the run-time and task-based performance of several state-of-the-art event detection techniques for Twitter. In order to reproducibly compare run-time performance, our approach is based on a general-purpose data stream management system, whereas task-based performance is automatically assessed based on a series of novel measures.

Highlights
Streaming implementations of state-of-the-art event detection techniques for Twitter.
Study of the task-based and run-time performance of event detection techniques.
Scalable measures to assess performance of event detection techniques automatically.
Platform-based approach to enable further performance studies for future techniques.
