Rank-Integrated Topic Modeling: A General Framework

Rank-integrated topic models, which incorporate link structures into topic modeling through topical ranking, have shown promising performance compared with other link-combined topic models. However, existing work on rank-integrated topic modeling treats ranking as a document distribution for each topic, and therefore cannot integrate topical ranking with the LDA model, one of the most popular topic models. In this paper, we introduce a new method to integrate topical ranking with topic modeling and propose a general framework for topic modeling of documents with link structures. By interpreting the normalized topical ranking score vectors as topic distributions for documents, we fuse ranking into topic modeling within this framework. Under the framework, we construct two rank-integrated PLSA models and two rank-integrated LDA models, and present the corresponding learning algorithms. We apply our models to four real datasets and compare them with baseline topic models and state-of-the-art link-combined topic models in terms of generalization performance, document classification, document clustering, and topic interpretability. Experiments show that all rank-integrated topic models perform better than the baseline models, and that the rank-integrated LDA models outperform all the compared models.
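
The difference between the two interpretations of topical ranking described above can be illustrated with a minimal sketch. It assumes the topical ranking scores are available as a non-negative documents-by-topics table. The function names, the toy scores, and the uniform fallback for all-zero rows or columns are illustrative assumptions, not the paper's algorithm. Normalizing each row yields a topic distribution per document (the interpretation proposed here), whereas normalizing each column yields a document distribution per topic (the interpretation attributed to existing work).

```python
def topic_distributions_per_document(rank_scores):
    """Row-normalize: each document's scores become a distribution over topics."""
    result = []
    for doc_scores in rank_scores:
        if any(score < 0 for score in doc_scores):
            raise ValueError("ranking scores must be non-negative")
        total = sum(doc_scores)
        if total > 0:
            result.append([score / total for score in doc_scores])
        else:
            # Assumption: a document with no ranking mass gets a uniform distribution.
            result.append([1.0 / len(doc_scores)] * len(doc_scores))
    return result


def document_distributions_per_topic(rank_scores):
    """Column-normalize: each topic's scores become a distribution over documents."""
    n_docs = len(rank_scores)
    n_topics = len(rank_scores[0])
    totals = [sum(row[k] for row in rank_scores) for k in range(n_topics)]
    return [
        [
            # Assumption: a topic with no ranking mass is spread uniformly over documents.
            row[k] / totals[k] if totals[k] > 0 else 1.0 / n_docs
            for k in range(n_topics)
        ]
        for row in rank_scores
    ]


# Hypothetical topical ranking scores: 3 documents (rows) x 2 topics (columns).
scores = [
    [3.0, 1.0],
    [1.0, 1.0],
    [0.0, 2.0],
]

theta = topic_distributions_per_document(scores)
# theta == [[0.75, 0.25], [0.5, 0.5], [0.0, 1.0]]; every row sums to 1.

per_topic = document_distributions_per_topic(scores)
# per_topic == [[0.75, 0.25], [0.25, 0.25], [0.0, 0.5]]; every column sums to 1.
```

Only the row-normalized form gives each document its own topic mixture, which is the quantity that PLSA and LDA model directly. How the framework then couples these vectors with the learning algorithms is not specified in the abstract and is left out of the sketch.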
