Overlapping Target Event and Story Line Detection of Online Newspaper Articles

Event detection from text data is an active area of research. While prior work has emphasized event identification and labeling from a single data source, this work considers event and story line detection across a large number of data sources. In this setting, different events in the same domain (e.g., violence, sports, or politics) naturally occur at the same time, and different story lines about the same event emerge. To capture events in this setting, we propose an algorithm that detects events, and the story lines within them, for a target domain. Our algorithm leverages a multi-relational, sentence-level semantic graph and well-known graph properties to identify overlapping events and the story lines within those events. We evaluate our approach on two large data sets containing millions of news articles from a large number of sources. Our empirical analysis shows that our approach improves detection precision and recall by 10% to 25% while providing complete event summaries.
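
The abstract does not spell out the algorithm, so the following is only a minimal illustrative sketch of the general idea of a multi-relational sentence graph, not the authors' method. It assumes each sentence has already been annotated with sets of labels per relation type; the relation names (`entity`, `action`), the sample data, and all function names are hypothetical. Sentences are linked when they share a label under some relation, and connected components are computed separately per relation, so a single sentence can fall into more than one candidate group.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(sentences):
    """Build one adjacency map per relation type.

    sentences: dict of sentence id -> dict of relation -> set of labels.
    Two sentences are linked under a relation if they share a label for it.
    """
    graph = defaultdict(lambda: defaultdict(set))
    for (a, feats_a), (b, feats_b) in combinations(sentences.items(), 2):
        for relation in feats_a.keys() & feats_b.keys():
            if feats_a[relation] & feats_b[relation]:
                graph[relation][a].add(b)
                graph[relation][b].add(a)
    return graph


def components(adjacency):
    """Return the connected components of an undirected adjacency map."""
    seen, groups = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups


def candidate_groups(sentences):
    """Group sentences per relation; groups from different relations may overlap."""
    graph = build_graph(sentences)
    return {relation: components(adj) for relation, adj in graph.items()}


# Hypothetical annotated sentences (assumed input, not data from the paper).
sentences = {
    "s1": {"entity": {"Baltimore"}, "action": {"protest"}},
    "s2": {"entity": {"Baltimore"}, "action": {"curfew"}},
    "s3": {"entity": {"Ferguson"}, "action": {"protest"}},
}
groups = candidate_groups(sentences)
```

In this toy input, `s1` is linked to `s2` through a shared entity and to `s3` through a shared action, so it appears in a group under each relation. That is one simple way overlap can arise in a multi-relational graph; the paper's own graph properties and grouping criteria may differ.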
