Annealing Paths for the Evaluation of Topic Models

Statistical topic models such as latent Dirichlet allocation have become enormously popular in the past decade, with dozens of learning algorithms and extensions being proposed each year. As these models and algorithms continue to be developed, it becomes increasingly important to evaluate them relative to previous techniques. However, evaluating the predictive performance of a topic model is a computationally difficult task. Annealed importance sampling (AIS), a Monte Carlo technique which operates by annealing between two distributions, has previously been used successfully for topic model evaluation (Wallach et al., 2009b). This technique estimates the likelihood of a held-out document by simulating an annealing process from the prior to the posterior over the latent topic assignments, and using this simulation as an importance sampling proposal distribution. In this paper we introduce new AIS annealing paths which instead anneal from one topic model to another, thereby estimating the relative performance of the models. This strategy can exhibit much lower empirical variance than previous approaches, facilitating reliable per-document comparisons of topic models. We then show how to use these paths to evaluate the predictive performance of topic model learning algorithms by efficiently estimating the held-out likelihood at each iteration of the training procedure. The proposed method achieves better held-out likelihood estimates for this task than previous algorithms with, in some cases, an order of magnitude less computation.
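To ground the mechanism the abstract describes, the sketch below implements generic annealed importance sampling along a geometric path between two unnormalized densities: annealing from model A to model B yields an importance-weighted estimate of the log ratio of their normalizing constants, i.e., their relative likelihood. Everything here is an illustrative assumption rather than the paper's implementation: the toy 1-D Gaussian densities log_f0 and log_f1 merely stand in for two topic models' unnormalized held-out document likelihoods, which in the paper are annealed over latent topic assignments.

import numpy as np

rng = np.random.default_rng(0)

def log_f0(x):
    # Unnormalized log density of "model A" (hypothetical stand-in); Z0 = sqrt(2*pi).
    return -0.5 * x ** 2

def log_f1(x):
    # Unnormalized log density of "model B"; an N(2, 0.5^2) shape, so Z1 = 0.5 * sqrt(2*pi).
    return -0.5 * ((x - 2.0) / 0.5) ** 2

def ais_log_ratio(n_samples=500, n_temps=200, step=0.5):
    # Estimate log(Z1/Z0) by annealing along the geometric path
    # f_beta(x) = f0(x)^(1-beta) * f1(x)^beta, with beta running from 0 to 1.
    betas = np.linspace(0.0, 1.0, n_temps + 1)
    log_w = np.zeros(n_samples)
    x = rng.standard_normal(n_samples)  # exact samples from model A
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # AIS weight increment: log f_b(x) - log f_{b_prev}(x) at the current state.
        log_w += (b - b_prev) * (log_f1(x) - log_f0(x))
        # One Metropolis step leaving the intermediate density f_b invariant.
        prop = x + step * rng.standard_normal(n_samples)
        log_accept = ((1 - b) * log_f0(prop) + b * log_f1(prop)
                      - (1 - b) * log_f0(x) - b * log_f1(x))
        accepted = np.log(rng.random(n_samples)) < log_accept
        x = np.where(accepted, prop, x)
    # Log-mean-exp of the weights gives a consistent estimate of log(Z1/Z0).
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))

print(ais_log_ratio())  # ground truth here is log(0.5) ~= -0.693

Run as-is, the estimate should land near log(0.5) ≈ -0.693, the true log ratio of the two normalizers. The variance argument in the abstract corresponds to the observation that when the two endpoint models are similar, this ratio can be estimated far more reliably than each model's likelihood separately.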

[1] James G. Scott and Jason Baldridge. A recursive estimate for the predictive likelihood in a topic model. AISTATS, 2013.

[2] Viet-An Nguyen et al. Modeling topic control to detect influence in conversations using nonparametric topic models. Machine Learning, 2014.

[3] Matthew D. Hoffman et al. Online Learning for Latent Dirichlet Allocation. NIPS, 2010.

[4] David Mimno and Andrew McCallum. Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression. UAI, 2008.

[5] David Newman et al. Automatic Evaluation of Topic Coherence. NAACL, 2010.

[6] Justin Grimmer. A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases. Political Analysis, 2010.

[7] Roger B. Grosse et al. Annealing between distributions by averaging moments. NIPS, 2013.

[8] Hanna M. Wallach et al. Evaluation methods for topic models. ICML, 2009.

[9] James R. Foulds. Latent Variable Modeling for Networks and Text: Algorithms, Models and Evaluation Techniques, 2014.

[10] Dragomir R. Radev et al. The ACL anthology network corpus. Language Resources and Evaluation, 2009.

[11] Jonathan Chang et al. Reading Tea Leaves: How Humans Interpret Topic Models. NIPS, 2009.

[12] Radford M. Neal. Annealed importance sampling. Statistics and Computing, 2001.

[13] Jeffrey Heer et al. Differentiating language usage through topic models, 2013.

[14] David Mimno et al. Optimizing Semantic Coherence in Topic Models. EMNLP, 2011.

[15] Michal Rosen-Zvi et al. The Author-Topic Model for Authors and Documents. UAI, 2004.

[16] James R. Foulds et al. Stochastic collapsed variational Bayesian inference for latent Dirichlet allocation. KDD, 2013.

[17] Matthew D. Hoffman et al. Stochastic variational inference. Journal of Machine Learning Research, 2013.

[18] Wray L. Buntine. Estimating Likelihoods for Topic Models. ACML, 2009.

[19] Hanna M. Wallach et al. Rethinking LDA: Why Priors Matter. NIPS, 2009.

[20] David M. Blei et al. Latent Dirichlet Allocation. Journal of Machine Learning Research, 2003.