Detecting Priming News Events

We study the problem of detecting priming events based on a time series index and an evolving document stream. We define a priming event as an event that triggers abnormal movements of the time series index, e.g., the Iraq war with respect to the presidential approval index of President Bush. Existing solutions focus either on organizing coherent keywords from a document stream into events or on identifying correlated movements between keyword frequency trajectories and the time series index. In this paper, we tackle the problem in two major steps. (1) We identify the elements that form a priming event. We call such an element an influential topic, which consists of a set of coherent keywords. We extract influential topics by examining the correlation between keyword trajectories and the time series index of interest at a global level. (2) We extract priming events by detecting and organizing the bursty influential topics at a micro level. We evaluate our algorithms on a real-world dataset, and the results confirm that our method is able to discover priming events effectively.
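The two steps can be illustrated with a minimal sketch. It is not the algorithm of this paper: the abstract does not specify the correlation measure or the burst model, so the sketch assumes plain Pearson correlation for the global step and a simple mean-plus-z-standard-deviations rule for the micro step. The function names `pearson`, `influential_keywords`, and `bursty_periods`, the thresholds, and the toy data are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def influential_keywords(trajectories, index, threshold=0.8):
    """Global step (assumed form): keep keywords whose frequency trajectory
    correlates strongly, positively or negatively, with the index."""
    return sorted(
        word
        for word, traj in trajectories.items()
        if abs(pearson(traj, index)) >= threshold
    )


def bursty_periods(trajectory, z=1.0):
    """Micro step (assumed form): time steps whose frequency exceeds
    mean + z * population standard deviation of the trajectory."""
    n = len(trajectory)
    mean = sum(trajectory) / n
    std = sqrt(sum((v - mean) ** 2 for v in trajectory) / n)
    return [t for t, v in enumerate(trajectory) if v > mean + z * std]


# Toy data: one index and three keyword frequency trajectories.
index = [50, 52, 51, 70, 72, 71]
trajectories = {
    "iraq": [1, 2, 1, 9, 10, 9],
    "weather": [5, 5, 5, 5, 5, 5],
    "sports": [3, 1, 4, 1, 5, 2],
}
selected = influential_keywords(trajectories, index)
bursts = bursty_periods([1, 1, 1, 1, 10, 1])
```

On this toy data only the keyword whose trajectory moves with the index passes the global filter, and the burst rule flags the single spike. Grouping the selected keywords into coherent topics and organizing bursts into events would require further steps that the abstract does not detail.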
