A Model for Anticipatory Event Detection

Event detection discovers new events reported in a stream of text documents. Previous research in event detection has largely focused on finding the first story and tracking the events of a specific topic, where a topic is simply a set of related events defined by user-supplied keywords, with no associated semantics and little domain knowledge. We therefore introduce the Anticipatory Event Detection (AED) problem: given a user-preferred event transition in a topic, detect the occurrence of that transition in the stream of news covering the topic. We confine the events to a single application domain, namely mergers and acquisitions. Our experiments show that the classical cosine similarity method fails on the AED task, whereas our conceptual model-based approach, which uses domain knowledge and named entity type assignments, seems promising. In particular, an AED voting classifier operating on a vector representation in which named entities are replaced by their types performed AED successfully.
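To make the idea of a type-substituted vector representation with voting concrete, the following is a minimal sketch and not the authors' implementation. The gazetteer `ENTITY_TYPES`, the two kinds of voter, the thresholds, and the example sentences are all hypothetical and chosen only for illustration; a real system would presumably use a trained named entity recognizer and learned classifiers.

```python
import math
import re
from collections import Counter

# Hypothetical gazetteer mapping entity mentions to type tokens (assumption).
ENTITY_TYPES = {
    "acme corp": "ORG",
    "globex": "ORG",
    "initech": "ORG",
    "umbrella": "ORG",
}

def replace_entities(text, gazetteer=ENTITY_TYPES):
    """Lowercase the text and replace known entity mentions by their type."""
    lowered = text.lower()
    # Replace longer mentions first so multi-word names are matched whole.
    for mention in sorted(gazetteer, key=len, reverse=True):
        pattern = r"\b" + re.escape(mention) + r"\b"
        lowered = re.sub(pattern, gazetteer[mention], lowered)
    return lowered

def vectorize(text):
    """Bag-of-words term counts over the type-substituted text."""
    return Counter(re.findall(r"[A-Za-z]+", replace_entities(text)))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def make_similarity_voter(example, threshold):
    """Voter: fires when the document is close enough to a labelled example."""
    example_vec = vectorize(example)
    return lambda doc: cosine(vectorize(doc), example_vec) >= threshold

def make_keyword_voter(keywords):
    """Voter: fires when any transition keyword occurs in the document."""
    return lambda doc: any(k in vectorize(doc) for k in keywords)

def majority_vote(voters, doc):
    """Report the transition only if more than half of the voters fire."""
    votes = [voter(doc) for voter in voters]
    return sum(votes) > len(votes) / 2

# Illustrative voters for the transition "acquisition completed".
voters = [
    make_similarity_voter("Acme Corp completed its acquisition of Globex", 0.8),
    make_similarity_voter("Acme Corp finalized the merger with Globex", 0.5),
    make_keyword_voter({"completed", "finalized"}),
]

print(majority_vote(voters, "Initech completed its acquisition of Umbrella"))
print(majority_vote(voters, "Initech is in talks to acquire Umbrella"))
```

Because company names are mapped to the shared token `ORG`, a sentence about previously unseen companies can still match a labelled example of the same transition, which is the intuition behind replacing named entities by their types.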
