Topic Modeling using Variational Auto-Encoders with Gumbel-Softmax and Logistic-Normal Mixture Distributions

Probabilistic Topic Models are widely applied in many NLP-related tasks due to their effective use of unlabeled data to capture variable dependencies. Analytical solutions for Bayesian inference of such models, however, are usually intractable, hindering the development of highly expressive text models. In this scenario, Variational Auto-Encoders (VAEs), in which an inference network (the encoder) approximates the posterior distribution, have become a promising alternative for inferring the latent topic distributions of text documents. These models, however, pose new challenges of their own, such as the requirement of continuous and reparameterizable latent distributions, which may not fit the true latent topic distributions well. Moreover, inference networks are prone to component collapsing, which impairs the discovery of coherent topics. To overcome these problems, we propose two new text topic models: one based on the Gumbel-Softmax relaxation of the categorical distribution (GSDTM) and one based on mixtures of Logistic-Normal distributions (LMDTM). We also provide a study of the impact of different modeling choices on the generated topics, observing a trade-off between topic coherence and document reconstruction. Through experiments on two reference datasets, we show that GSDTM largely outperforms previous state-of-the-art baselines on three different evaluation metrics.
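To make the reparameterization requirement concrete, the sketch below illustrates how the two latent-topic samplers named in the abstract can be drawn with the reparameterization trick: a Gumbel-Softmax relaxation of a categorical topic variable and a mixture of Logistic-Normal distributions. This is not the authors' implementation; the temperature, topic count, and mixture parameters are illustrative placeholders, and NumPy is used only for self-containment.

```python
# Minimal sketch (not the authors' code) of the two reparameterizable topic
# samplers referred to in the abstract. All parameter values are placeholders.
import numpy as np

rng = np.random.default_rng(0)

def sample_gumbel_softmax(logits, temperature=0.5):
    """Differentiable, near-one-hot sample from unnormalized log-probabilities."""
    # Gumbel(0, 1) noise via the inverse CDF: g = -log(-log(u)), u ~ Uniform(0, 1)
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    g = -np.log(-np.log(u))
    scores = (logits + g) / temperature   # lower temperature -> closer to one-hot
    scores -= scores.max()                # numerical stability for the softmax
    e = np.exp(scores)
    return e / e.sum()

def sample_logistic_normal_mixture(weights, means, log_vars):
    """Topic proportions drawn from a mixture of Logistic-Normal distributions."""
    k = rng.choice(len(weights), p=weights)          # pick a mixture component
    eps = rng.standard_normal(means[k].shape)        # eps ~ N(0, I)
    h = means[k] + np.exp(0.5 * log_vars[k]) * eps   # reparameterized Gaussian draw
    e = np.exp(h - h.max())
    return e / e.sum()                               # softmax maps onto the simplex

# Example with 10 topics and placeholder mixture parameters
print(sample_gumbel_softmax(np.zeros(10)))
print(sample_logistic_normal_mixture(np.array([0.6, 0.4]),
                                     np.zeros((2, 10)),
                                     np.zeros((2, 10))))
```

Because both samplers express the random draw as a deterministic function of the model parameters and an auxiliary noise source, gradients can flow through the sample back into the encoder, which is what a VAE-based topic model needs.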
