Relational Topic Models for Document Networks

We develop the relational topic model (RTM), a model of documents and the links between them. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and learning algorithms based on variational methods and evaluate the predictive performance of the RTM for large networks of scientific abstracts and web documents.
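As a rough illustration of what it means for a link to be a binary random variable conditioned on document contents, the sketch below scores a pair of documents from their topic proportions and passes the score through a logistic function. The abstract does not specify the link function, so the logistic form, the element-wise product of topic proportions, and the names `link_probability`, `eta`, and `nu` are assumptions made here for illustration only; the topic proportions are made-up numbers, not results.

```python
import math


def link_probability(topics_a, topics_b, eta, nu):
    """Hypothetical link function: P(link = 1 | topic proportions of two documents).

    topics_a, topics_b: per-document topic proportions (each sums to 1).
    eta: per-topic weights (assumed parameter), nu: intercept (assumed parameter).
    """
    # The element-wise product is large only for topics both documents use.
    score = sum(e * a * b for e, a, b in zip(eta, topics_a, topics_b)) + nu
    # Logistic function maps the score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


# Illustrative topic proportions over three topics (invented for this sketch).
doc_a = [0.8, 0.1, 0.1]
doc_b = [0.7, 0.2, 0.1]  # similar topic mix to doc_a
doc_c = [0.1, 0.1, 0.8]  # different topic mix from doc_a

eta = [5.0, 5.0, 5.0]
nu = -2.0

p_similar = link_probability(doc_a, doc_b, eta, nu)
p_different = link_probability(doc_a, doc_c, eta, nu)
print(p_similar, p_different)
```

Under these assumptions, documents with similar topic mixes receive a higher link probability than documents with different mixes, and the function is symmetric in its two document arguments, which suits undirected links.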
