Real Time Event Detection in Twitter

Event detection has long been an important task, but Twitter poses new problems: its data is a massive, noisy temporal stream covering a wide variety of topics. Traditional methods with high computational complexity are not designed to process such a stream efficiently. In this paper, we propose a Gaussian mixture model for extracting bursty words from Twitter, and then employ a novel time-dependent hierarchical Dirichlet process (HDP) model to detect new topics. Our model promptly and accurately captures new events, together with their locations and the times at which they become bursty. Experiments demonstrate the effectiveness of our model for real-time event detection in Twitter.
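
To make the idea of bursty word extraction with a Gaussian mixture concrete, the following is a minimal sketch and not the model proposed in the paper. It assumes that a word's counts per time window can be described by two one-dimensional Gaussians, a low-mean "background" component and a high-mean "burst" component, fitted with plain EM. The function names (`gaussian_pdf`, `bursty_windows`), the initialisation, the iteration count, and the 0.5 posterior threshold are all illustrative assumptions.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def bursty_windows(counts, iterations=50, threshold=0.5, min_var=1e-3):
    """Return indices of time windows assigned to the high-mean component.

    Hypothetical helper: fits a two-component 1-D Gaussian mixture to the
    per-window counts of a single word using EM.
    """
    n = len(counts)
    lo, hi = min(counts), max(counts)
    if n < 2 or lo == hi:
        return []  # no variation, so nothing can be called bursty

    # Initialise: component 0 at the minimum, component 1 at the maximum.
    overall_mean = sum(counts) / n
    var0 = max(sum((c - overall_mean) ** 2 for c in counts) / n, min_var)
    means = [float(lo), float(hi)]
    variances = [var0, var0]
    weights = [0.5, 0.5]
    resp = [0.5] * n  # posterior probability of the burst component

    for _ in range(iterations):
        # E-step: posterior of the burst component for every window.
        resp = []
        for c in counts:
            p0 = weights[0] * gaussian_pdf(c, means[0], variances[0])
            p1 = weights[1] * gaussian_pdf(c, means[1], variances[1])
            total = p0 + p1
            if total == 0.0:
                # Both densities underflowed: fall back to the nearest mean.
                resp.append(1.0 if abs(c - means[1]) < abs(c - means[0]) else 0.0)
            else:
                resp.append(p1 / total)

        # M-step: re-estimate weights, means, and variances.
        n1 = sum(resp)
        n0 = n - n1
        if n0 < 1e-9 or n1 < 1e-9:
            break  # one component has collapsed; keep the last estimate
        means = [
            sum((1.0 - r) * c for r, c in zip(resp, counts)) / n0,
            sum(r * c for r, c in zip(resp, counts)) / n1,
        ]
        variances = [
            max(sum((1.0 - r) * (c - means[0]) ** 2 for r, c in zip(resp, counts)) / n0, min_var),
            max(sum(r * (c - means[1]) ** 2 for r, c in zip(resp, counts)) / n1, min_var),
        ]
        weights = [n0 / n, n1 / n]

    return [i for i, r in enumerate(resp) if r > threshold]


# Made-up hourly counts of one word; windows 5 and 6 contain a spike.
example_counts = [3, 4, 2, 5, 3, 40, 45, 4, 3, 2]
print(bursty_windows(example_counts))
```

In a streaming setting one would presumably update such a fit incrementally rather than refit from scratch for every word and window; that refinement is outside the scope of this sketch.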
