Paper information - Hierarchical relational models for document networks

Hierarchical relational models for document networks

We develop the relational topic model (RTM), a hierarchical model of both network structure and node attributes. We focus on document networks, where the attributes of each document are its words, that is, discrete observations taken from a fixed vocabulary. For each pair of documents, the RTM models their link as a binary random variable that is conditioned on their contents. The model can be used to summarize a network of documents, predict links between them, and predict words within them. We derive efficient inference and estimation algorithms based on variational methods that take advantage of sparsity and scale with the number of links. We evaluate the predictive performance of the RTM for large networks of scientific abstracts, web documents, and geographically tagged news.
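To make the idea of a link modeled as a binary random variable conditioned on document contents more concrete, here is a minimal sketch. The abstract does not state the link function, so the logistic form below, the helper names `mean_topic_proportions` and `link_probability`, and the parameters `eta` and `nu` are assumptions for illustration only. The toy topic assignments are invented and are not data from the paper.

```python
import math


def mean_topic_proportions(topic_assignments, num_topics):
    """Fraction of a document's words assigned to each topic."""
    counts = [0] * num_topics
    for k in topic_assignments:
        counts[k] += 1
    n = len(topic_assignments)
    return [c / n for c in counts]


def link_probability(z_bar_a, z_bar_b, eta, nu):
    """Assumed logistic link: sigmoid(sum_k eta_k * a_k * b_k + nu).

    The elementwise product rewards pairs of documents that put
    weight on the same topics.
    """
    score = sum(e * a * b for e, a, b in zip(eta, z_bar_a, z_bar_b)) + nu
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical per-word topic assignments for three short documents.
doc_a = [0, 0, 1, 0]
doc_b = [0, 0, 0, 1]
doc_c = [2, 2, 2, 1]

eta = [4.0, 4.0, 4.0]  # assumed per-topic weights
nu = -2.0              # assumed intercept

z_a = mean_topic_proportions(doc_a, 3)
z_b = mean_topic_proportions(doc_b, 3)
z_c = mean_topic_proportions(doc_c, 3)

p_ab = link_probability(z_a, z_b, eta, nu)  # documents with similar topics
p_ac = link_probability(z_a, z_c, eta, nu)  # documents with different topics
print(p_ab, p_ac)
```

Under these assumed parameters, the pair that shares topics receives a higher link probability than the pair that does not. This is the behavior that lets a model of this kind predict links from text.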

Jonathan D. Chang | D. Blei
