Feature-Rich Segment-Based News Event Detection on Twitter

Event detection on Twitter is an important and challenging research topic. On the one hand, Twitter provides first-hand information and fast broadcasting. On the other hand, its content is short and noisy, its data volume is large, and its topics change quickly. Dominant approaches to Twitter event detection model events by clustering tweets, words, or segments; of these, segments have been shown to be advantageous over both words and tweets for news event detection. We study segment-based news event detection, for which existing heuristic-based methods suffer from low recall. To address this issue, we propose feature-based event filtering. Our filter incorporates a rich family of features that are shown empirically to be valuable. Experimental results show that our event detection system outperforms the state-of-the-art baseline, with doubled recall and increased precision.
