Dynamic and Static Topic Model for Analyzing Time-Series Document Collections

Extracting meaningful topics from texts requires properly accounting for their structure. In this paper, we aim to analyze structured time-series documents, such as collections of news articles or series of scientific papers, in which topics evolve over time depending on multiple past topics and are also related to one another at each time step. To this end, we propose a dynamic and static topic model, which simultaneously considers the dynamic structure of temporal topic evolution and the static structure of the topic hierarchy at each time step. In experiments on collections of scientific papers, the proposed method outperformed conventional models. Moreover, we show an example of the extracted topic structures, which we found helpful for analyzing research activities.
