LPTA: A Probabilistic Model for Latent Periodic Topic Analysis

This paper studies the problem of latent periodic topic analysis from time-stamped documents. Examples of time-stamped documents include news articles, sales records, financial reports, TV programs, and, more recently, posts from social media websites such as Flickr, Twitter, and Facebook. Unlike detecting periodic patterns in traditional time series databases, we discover topics with coherent semantics and periodic characteristics, where each topic is represented by a distribution over words. We propose a model called LPTA (Latent Periodic Topic Analysis) that exploits both the periodicity of terms and term co-occurrences. To show the effectiveness of our model, we collect several representative datasets, including Seminar, DBLP, and Flickr. The results show that our model discovers latent periodic topics effectively and makes good use of information from both text and time.
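
To make the idea of a topic with both a word distribution and periodic timing concrete, the following is a minimal sketch, not the LPTA model itself. The abstract does not specify how time is modeled, so the equal-weight mixture of Gaussians repeated every `period` time units is an assumption made purely for illustration. All names (`periodic_time_density`, `joint_score`, the `topic` dictionary and its values) are hypothetical.

```python
import math


def periodic_time_density(t, mean, std, period, n_periods):
    """Equal-weight mixture of Gaussians centred at mean + k * period.

    Assumed form for illustration only; the source does not state it.
    """
    total = 0.0
    for k in range(n_periods):
        mu = mean + k * period
        z = (t - mu) / std
        total += math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))
    return total / n_periods


def joint_score(word, t, topic):
    """Score a (word, timestamp) pair as p(word | topic) * p(t | topic)."""
    p_word = topic["words"].get(word, 0.0)
    p_time = periodic_time_density(
        t, topic["mean"], topic["std"], topic["period"], topic["n_periods"]
    )
    return p_word * p_time


# Hypothetical weekly topic: peaks at day 2, 9, 16, 23.
topic = {
    "words": {"seminar": 0.6, "talk": 0.4},
    "mean": 2.0,
    "std": 0.5,
    "period": 7.0,
    "n_periods": 4,
}

on_peak = joint_score("seminar", 9.0, topic)   # timestamp on a periodic peak
off_peak = joint_score("seminar", 5.5, topic)  # timestamp between two peaks
```

Under these assumptions, a word that belongs to the topic scores higher when its timestamp falls on one of the repeating peaks than when it falls between them, which is the sense in which such a topic combines text and time.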
