Learning Latent Variable Models by Improving Spectral Solutions with Exterior Point Method

Probabilistic latent-variable models are a fundamental tool in statistics and machine learning. Despite their widespread use, identifying the parameters of basic latent variable models continues to be an extremely challenging problem. Traditional maximum likelihood-based learning algorithms find valid parameters, but suffer from high computational cost, slow convergence, and local optima. In contrast, recently developed spectral algorithms are computationally efficient and provide strong statistical guarantees, but are not guaranteed to find valid parameters. In this work, we introduce a two-stage learning algorithm for latent variable models. We first use a spectral method of moments algorithm to find a solution that is close to the optimal solution but not necessarily in the valid set of model parameters. We then incrementally refine the solution via an exterior point method until a local optima that is arbitrarily near the valid set of parameters is found. We perform several experiments on synthetic and real-world data and show that our approach is more accurate than previous work, especially when training data is limited.

[1]  Anima Anandkumar,et al.  Tensor decompositions for learning latent variable models , 2012, J. Mach. Learn. Res..

[2]  L. R. Rabiner,et al.  An introduction to the application of the theory of probabilistic functions of a Markov process to automatic speech recognition , 1983, The Bell System Technical Journal.

[3]  Le Song,et al.  Hilbert Space Embeddings of Hidden Markov Models , 2010, ICML.

[4]  Sham M. Kakade,et al.  Learning mixtures of spherical gaussians: moment methods and spectral decompositions , 2012, ITCS '13.

[5]  Heinz H. Bauschke,et al.  Fixed-Point Algorithms for Inverse Problems in Science and Engineering , 2011, Springer Optimization and Its Applications.

[6]  K. Pearson Contributions to the Mathematical Theory of Evolution , 1894 .

[7]  Karl Stratos,et al.  Experiments with Spectral Learning of Latent-Variable PCFGs , 2013, HLT-NAACL.

[8]  Roman A. Polyak,et al.  Primal–dual exterior point method for convex optimization , 2008, Optim. Methods Softw..

[9]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[10]  Le Song,et al.  Nonparametric Estimation of Multi-View Latent Variable Models , 2013, ICML.

[11]  Ariadna Quattoni,et al.  Local Loss Optimization in Operator Models: A New Insight into Spectral Learning , 2012, ICML.

[12]  Joelle Pineau,et al.  Methods of Moments for Learning Stochastic Languages: Unified Presentation and Empirical Comparison , 2014, ICML.

[13]  R. Fletcher Practical Methods of Optimization , 1988 .

[14]  Le Song,et al.  A Spectral Algorithm for Latent Junction Trees , 2012, UAI.

[15]  Joos Vandewalle,et al.  A Multilinear Singular Value Decomposition , 2000, SIAM J. Matrix Anal. Appl..

[16]  Hiroshi Yamashita,et al.  A Primal-Dual Exterior Point Method for Nonlinear Optimization , 2010, SIAM J. Optim..

[17]  C. Byrne Sequential unconstrained minimization algorithms for constrained optimization , 2008 .

[18]  Yoram Singer,et al.  Efficient projections onto the l1-ball for learning in high dimensions , 2008, ICML '08.

[19]  Stephen P. Boyd,et al.  Convex Optimization , 2004, Algorithms and Theory of Computation Handbook.

[20]  Patrick L. Combettes,et al.  Proximal Splitting Methods in Signal Processing , 2009, Fixed-Point Algorithms for Inverse Problems in Science and Engineering.

[21]  Xi Chen,et al.  Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing , 2014, J. Mach. Learn. Res..

[22]  K. Schittkowski,et al.  NONLINEAR PROGRAMMING , 2022 .

[23]  Veronica Bloom Exterior-Point Algorithms for Solving Large-Scale Nonlinear Optimization Problems , 2014 .

[24]  Anima Anandkumar,et al.  A Method of Moments for Mixture Models and Hidden Markov Models , 2012, COLT.

[25]  Byron Boots,et al.  Reduced-Rank Hidden Markov Models , 2009, AISTATS.

[26]  Tong Zhang,et al.  Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization , 2013, Mathematical Programming.

[27]  Le Song,et al.  A Spectral Algorithm for Latent Tree Graphical Models , 2011, ICML.

[28]  Dean Alderucci A SPECTRAL ALGORITHM FOR LEARNING HIDDEN MARKOV MODELS THAT HAVE SILENT STATES , 2015 .

[29]  Anima Anandkumar,et al.  Two SVDs Suffice: Spectral decompositions for probabilistic topic modeling and latent Dirichlet allocation , 2012, NIPS 2012.