Event detection with common user interests

In this paper, we aim to detect events of common user interest from a huge volume of user-generated content. The degree of interest that common users have in an event is evidenced by a significant surge of event-related queries issued to search for documents (e.g., news articles, blog posts) relevant to the event. Taking the stream of user queries and the stream of documents as input, our proposed framework integrates the two streams into a single stream of query profiles. A query profile is the set of documents matching a query at a given time. With this single stream of query profiles, well-studied event detection techniques (e.g., incremental clustering) can be readily applied. In our experiments using real data collected from blog and news search engines, the proposed technique achieved very high event detection accuracy.
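The idea of a query profile stream can be illustrated with a minimal sketch. It is not the paper's implementation. The tuple layouts, the all-terms-must-appear matching rule, the Jaccard similarity, the 0.5 threshold, and the function names `build_query_profiles` and `incremental_clustering` are all assumptions made for illustration. The sketch pairs each query with the documents it matches in the same time window, then groups the resulting profiles with a simple single-pass incremental clustering.

```python
from collections import defaultdict


def build_query_profiles(queries, documents):
    """Merge a query stream and a document stream into query profiles.

    queries:   iterable of (time, query_text)          -- assumed layout
    documents: iterable of (time, doc_id, doc_text)    -- assumed layout
    Returns a list of (time, query_text, frozenset_of_doc_ids).
    """
    docs_by_time = defaultdict(list)
    for time, doc_id, text in documents:
        docs_by_time[time].append((doc_id, set(text.lower().split())))

    profiles = []
    seen = set()
    for time, query in queries:
        key = (time, query.lower())
        if key in seen:  # one profile per distinct query per time window
            continue
        seen.add(key)
        terms = set(query.lower().split())
        # Naive matching rule (assumption): every query term occurs in the document.
        matched = frozenset(
            doc_id for doc_id, words in docs_by_time[time] if terms <= words
        )
        if matched:
            profiles.append((time, query.lower(), matched))
    return profiles


def jaccard(a, b):
    """Jaccard similarity between two sets of document ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def incremental_clustering(profiles, threshold=0.5):
    """Single-pass clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster: {"docs": set of doc ids, "queries": list of str}
    for _time, query, docs in profiles:
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(docs, cluster["docs"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["docs"] |= docs
            best["queries"].append(query)
        else:
            clusters.append({"docs": set(docs), "queries": [query]})
    return clusters


# Toy data, invented for this example only.
documents = [
    (1, "d1", "earthquake hits city"),
    (1, "d2", "city earthquake rescue teams"),
    (1, "d3", "football final tonight"),
]
queries = [(1, "earthquake"), (1, "city earthquake"), (1, "football final")]

profiles = build_query_profiles(queries, documents)
clusters = incremental_clustering(profiles)
for cluster in clusters:
    print(sorted(cluster["docs"]), cluster["queries"])
```

In this toy run the two earthquake queries share the same matching documents and end up in one cluster, while the football query forms its own. Each cluster is a candidate event. A real system would also need a burst test on query volume, which the sketch omits.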
