ATT: analyzing temporal dynamics of topics and authors in social media

Understanding topical trends and user roles in topic evolution is an important challenge in information retrieval. In this contribution, we present a novel model for analyzing how users' interests evolve over time with respect to the content they produce. Our approach, the Author-Topic-Time model (ATT), addresses this problem through Bayesian modeling of the relations between authors, latent topics, and temporal information. We extend the state-of-the-art Latent Dirichlet Allocation (LDA) topic model to incorporate author and timestamp information, capturing changes in user interest over time with respect to evolving latent topics. We present results of applying the model to a dataset of 9 years of scientific publications from CiteSeer, showing improved detection of semantically cohesive topics and capturing shifts in authors' interests in relation to topic evolution. We also discuss opportunities for using the model in novel mining and recommendation scenarios.
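The abstract does not spell out the generative process, so the following is only a rough sketch of what an LDA extension with authors and timestamps could look like, not the authors' actual formulation. It assumes that each word token picks one of the document's authors uniformly, draws a topic from that author's topic distribution, then draws a word from the topic's word distribution and a normalized timestamp in [0, 1] from a per-topic Beta distribution. The Beta choice, the uniform author choice, and all names (`generate_document`, `theta`, `phi`, `beta_params`) are illustrative assumptions.

```python
import numpy as np


def generate_document(authors, n_words, theta, phi, beta_params, rng):
    """Sample (author, topic, word, timestamp) tuples for one document.

    theta       : (A, K) array, assumed per-author topic distributions
    phi         : (K, V) array, assumed per-topic word distributions
    beta_params : (K, 2) array, assumed Beta parameters per topic over
                  timestamps normalized to [0, 1]
    """
    n_topics, vocab_size = phi.shape
    tokens = []
    for _ in range(n_words):
        a = rng.choice(authors)                    # author picked uniformly (assumption)
        z = rng.choice(n_topics, p=theta[a])       # topic drawn from the author's interests
        w = rng.choice(vocab_size, p=phi[z])       # word drawn from the topic
        t = rng.beta(beta_params[z, 0], beta_params[z, 1])  # timestamp drawn from the topic
        tokens.append((int(a), int(z), int(w), float(t)))
    return tokens


# Toy setup: 3 authors, 2 topics, 5 vocabulary words (all values hypothetical).
rng = np.random.default_rng(0)
A, K, V = 3, 2, 5
theta = rng.dirichlet(np.full(K, 0.5), size=A)
phi = rng.dirichlet(np.full(V, 0.5), size=K)
beta_params = np.array([[2.0, 5.0],   # topic 0 concentrated early in the time span
                        [5.0, 2.0]])  # topic 1 concentrated late in the time span
doc = generate_document([0, 2], 8, theta, phi, beta_params, rng)
```

Under these assumptions, a shift in an author's interest would show up as that author's tokens moving from topics whose timestamp distribution peaks early to topics whose distribution peaks late. Inference (for example, by Gibbs sampling) is not shown here.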
