A Temporal Topic Model for Noisy Mediums

The volume of social media and online news content is increasing rapidly. The goal of this work is to identify the topics associated with this content and to understand how these topics change over time. We propose the Topic Flow Model (TFM), a graph-theoretic temporal topic model that identifies topics as they emerge and tracks them through time as they persist, diminish, and re-emerge. TFM identifies topic words by capturing the changing relationship strength of words over time, and offers solutions for dealing with flood words, i.e., domain-specific words that pollute topics. An extensive empirical analysis of TFM on Twitter data, newspaper articles, and synthetic data shows that its topic accuracy and the signal-to-noise ratio (SNR) of meaningful topic words are better than those of the existing state of the art.
