Spectral Methods for Supervised Topic Models

Supervised topic models simultaneously model the latent topic structure of large collections of documents and a response variable associated with each document. Existing inference methods are based on either variational approximation or Monte Carlo sampling. This paper presents a novel spectral decomposition algorithm to recover the parameters of supervised latent Dirichlet allocation (sLDA) models. The Spectral-sLDA algorithm is provably correct and computationally efficient. We prove a sample complexity bound and subsequently derive a sufficient condition for the identifiability of sLDA. Thorough experiments on a diverse range of synthetic and real-world datasets verify the theory and demonstrate the practical effectiveness of the algorithm.

[1]  David M. Blei,et al.  Supervised Topic Models , 2007, NIPS.

[2]  Francis R. Bach,et al.  Online Learning for Latent Dirichlet Allocation , 2010, NIPS.

[3]  J. Kruskal Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics , 1977 .

[4]  Sanjeev Arora,et al.  A Practical Algorithm for Topic Modeling with Provable Guarantees , 2012, ICML.

[5]  Sanjeev Arora,et al.  Computing a nonnegative matrix factorization -- provably , 2011, STOC '12.

[6]  Anima Anandkumar,et al.  A Method of Moments for Mixture Models and Hidden Markov Models , 2012, COLT.

[7]  Jure Leskovec,et al.  From amateurs to connoisseurs: modeling the evolution of user expertise through online reviews , 2013, WWW.

[8]  Anima Anandkumar,et al.  Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation , 2012, NIPS 2012.

[9]  Max Welling,et al.  Fast collapsed gibbs sampling for latent dirichlet allocation , 2008, KDD.

[10]  Michael I. Jordan,et al.  Latent Dirichlet Allocation , 2001, J. Mach. Learn. Res..

[11]  Anima Anandkumar,et al.  Tensor Decompositions for Learning Latent Variable Models (A Survey for ALT) , 2015, ALT.

[12]  R. Redner,et al.  Mixture densities, maximum likelihood, and the EM algorithm , 1984 .

[13]  Anima Anandkumar,et al.  Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..

[14]  Joel A. Tropp,et al.  Factoring nonnegative matrices with linear programs , 2012, NIPS.

[15]  Chong Wang,et al.  Simultaneous image classification and annotation , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Ning Chen,et al.  Gibbs max-margin topic models with data augmentation , 2013, J. Mach. Learn. Res..

[17]  Percy Liang,et al.  Spectral Experts for Estimating Mixtures of Linear Regressions , 2013, ICML.

[18]  Michael I. Jordan,et al.  DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification , 2008, NIPS.

[19]  Sanjeev Arora,et al.  Learning Topic Models -- Going beyond SVD , 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.

[20]  Shay B. Cohen,et al.  Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs , 2012, NIPS.

[21]  Eric P. Xing,et al.  MedLDA: maximum margin supervised topic models , 2012, J. Mach. Learn. Res..

[22]  S. Leurgans,et al.  A Decomposition for Three-Way Arrays , 1993, SIAM J. Matrix Anal. Appl..