Layer-Assisted Neural Topic Modeling over Document Networks

Neural topic modeling provides a flexible, efficient, and powerful way to extract topic representations from text documents. Unfortunately, most existing models cannot handle text data with network links, such as web pages with hyperlinks and scientific papers with citations. To handle this kind of data, we develop a novel neural topic model, the Layer-Assisted Neural Topic Model (LANTM), which can be interpreted from the perspective of variational auto-encoders. Our main motivation is to enhance the encoding of topic representations by using not only the text contents but also the accompanying network links. Specifically, LANTM encodes the texts and network links into topic representations with an augmented network containing graph convolutional modules, and decodes them by maximizing the likelihood of the generative process. Neural variational inference is adopted for efficient inference. Experimental results show that LANTM significantly outperforms existing models on topic quality, text classification, and link prediction.
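To make the encode/decode description concrete, the following is a minimal NumPy sketch of a graph-convolutional variational encoder for a topic model. It is an illustration under stated assumptions, not the LANTM architecture itself: the two-layer encoder, the standard normal prior, the softmax topic proportions, and all names (`normalize_adjacency`, `forward`, `w_hidden`, `w_mu`, `w_log_var`, `topic_word`) are hypothetical choices made for this example. The sketch reconstructs only the words; a link decoder is omitted.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetrically normalize A + I, as in a standard graph convolution."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # shift for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def forward(bow, adj, params, rng):
    """One forward pass: encode text + links, decode words, return (theta, ELBO)."""
    a_norm = normalize_adjacency(adj)
    # Encoder: graph convolutions mix each document's features with its neighbours'.
    h = np.maximum(a_norm @ bow @ params["w_hidden"], 0.0)   # ReLU
    mu = a_norm @ h @ params["w_mu"]
    log_var = a_norm @ h @ params["w_log_var"]
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + np.exp(0.5 * log_var) * rng.standard_normal(mu.shape)
    theta = softmax(z)                           # document-topic proportions
    beta = softmax(params["topic_word"])         # topic-word distributions
    # Decoder: bag-of-words log-likelihood under the topic mixture.
    log_lik = (bow * np.log(theta @ beta + 1e-10)).sum(axis=1)
    # KL divergence from N(mu, sigma^2) to the assumed N(0, I) prior.
    kl = 0.5 * (np.exp(log_var) + mu**2 - 1.0 - log_var).sum(axis=1)
    return theta, (log_lik - kl).mean()

# Toy data (assumed): 4 documents, 6 vocabulary words, 3 topics, chain-shaped links.
rng = np.random.default_rng(0)
n_docs, n_vocab, n_hidden, n_topics = 4, 6, 5, 3
bow = rng.integers(0, 4, size=(n_docs, n_vocab)).astype(float)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
params = {
    "w_hidden": 0.1 * rng.standard_normal((n_vocab, n_hidden)),
    "w_mu": 0.1 * rng.standard_normal((n_hidden, n_topics)),
    "w_log_var": 0.1 * rng.standard_normal((n_hidden, n_topics)),
    "topic_word": 0.1 * rng.standard_normal((n_topics, n_vocab)),
}
theta, elbo = forward(bow, adj, params, rng)
```

In a trained model the parameters would be fitted by maximizing the returned evidence lower bound with a stochastic optimizer; here they are random, so `theta` only demonstrates the shapes and the data flow.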
