Paper Information - Sequential Local Learning for Latent Graphical Models

Sequential Local Learning for Latent Graphical Models

Learning the parameters of latent graphical models (GMs) is inherently much harder than learning those of non-latent ones, since the latent variables make the corresponding log-likelihood non-concave. Nevertheless, expectation-maximization schemes are popular in practice, even though they typically get stuck in local optima. In recent years, the method of moments has provided a refreshing angle for resolving the non-convexity issue, but it is applicable to only a quite limited class of latent GMs. In this paper, we aim to enhance its power by enlarging this class of latent GMs. To this end, we introduce two novel concepts, coined marginalization and conditioning, which can reduce the problem of learning a larger GM to that of a smaller one. More importantly, they lead to a sequential learning framework that repeatedly increases the learned portion of a given latent GM, and thus covers a significantly broader and more complicated class of loopy latent GMs, which includes convolutional and random regular models.
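The non-concavity claim can be illustrated on a toy case that is not taken from the paper: an equal-weight mixture of two unit-variance Gaussians, where the latent variable is the component label. The function name `mixture_log_likelihood`, the means of -2 and 2, and the sample size are all assumptions chosen for illustration. Swapping the two means leaves the log-likelihood unchanged, while the midpoint of the two swapped parameter settings scores strictly lower, which a concave function could not do.

```python
import math
import random


def mixture_log_likelihood(data, mu1, mu2):
    """Log-likelihood of an equal-weight mixture of N(mu1, 1) and N(mu2, 1)."""
    total = 0.0
    for x in data:
        p1 = math.exp(-0.5 * (x - mu1) ** 2)
        p2 = math.exp(-0.5 * (x - mu2) ** 2)
        total += math.log(0.5 * (p1 + p2) / math.sqrt(2.0 * math.pi))
    return total


random.seed(0)
# Synthetic data: a hidden label picks the mean (-2 or +2) for each sample.
data = [random.gauss(random.choice([-2.0, 2.0]), 1.0) for _ in range(500)]

ll_a = mixture_log_likelihood(data, -2.0, 2.0)   # one labeling of the components
ll_b = mixture_log_likelihood(data, 2.0, -2.0)   # the swapped labeling
ll_mid = mixture_log_likelihood(data, 0.0, 0.0)  # midpoint of the two settings

# A concave function would satisfy ll_mid >= (ll_a + ll_b) / 2.
print("label-swap symmetric:", math.isclose(ll_a, ll_b))
print("midpoint is worse:   ", ll_mid < 0.5 * (ll_a + ll_b))
```

The two symmetric maxima separated by a lower region are the simplest form of the multi-modal landscape in which local search schemes such as expectation-maximization can stall.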
