TwitterNews+: A Framework for Real Time Event Detection from the Twitter Data Stream

In recent years, substantial research effort has gone into detecting events in real time from the Twitter data stream. Most of the proposed approaches, however, suffer from a high computational cost and are not evaluated on a publicly available corpus, which makes it difficult to compare them properly. In this paper, we propose a scalable event detection system, TwitterNews+, to detect and track newsworthy events in real time. TwitterNews+ uses a novel approach to cluster event-related tweets at a significantly lower computational cost than existing state-of-the-art approaches. Finally, we evaluate the effectiveness of TwitterNews+ using a publicly available corpus and its associated ground truth data set of newsworthy events. The evaluation shows a significant improvement in both recall and precision over the baselines we used.
