A (Stochastic) EM in General

Expectation-Maximization (EM) is an iterative method for finding maximum likelihood or maximum a posteriori (MAP) estimates of the parameters of a statistical model when the data are only partially observed, or when the model depends on unobserved latent variables. This section is inspired by a lecture by Dr. Namrata Vaswani, available at http://www.ece.iastate.edu/~namrata/EE527_Spring08/emlecture.pdf. We derive the EM algorithm for a very general class of models. Let us define all the quantities of interest.
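Before the formal definitions, the alternation between the two steps can be illustrated on a toy problem. The sketch below is not taken from the lecture or from the derivation. It assumes a two-component one-dimensional Gaussian mixture, and the function name `em_two_gaussians`, the initialisation at the data extremes, and the `stochastic` flag are all hypothetical choices made for illustration. With `stochastic=True`, the expectation over the latent assignments is replaced by a single sampled assignment per data point, which is one common reading of stochastic EM.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def em_two_gaussians(data, n_iter=100, stochastic=False, seed=0):
    """Fit a two-component 1-D Gaussian mixture with (stochastic) EM.

    Returns (weight of component 0, list of means, list of std deviations).
    """
    rng = random.Random(seed)
    # Crude initialisation (an assumption of this sketch): the data extremes.
    means = [min(data), max(data)]
    stds = [1.0, 1.0]
    weight0 = 0.5

    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp0 = []
        for x in data:
            p0 = weight0 * normal_pdf(x, means[0], stds[0])
            p1 = (1.0 - weight0) * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            r = p0 / total if total > 0.0 else 0.5
            if stochastic:
                # Stochastic variant: sample a hard assignment from the
                # posterior instead of keeping the soft responsibility.
                r = 1.0 if rng.random() < r else 0.0
            resp0.append(r)

        # M-step: weighted maximum-likelihood updates of the parameters.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            n_k = sum(resp)
            if n_k < 1e-12:
                continue  # empty component: keep its previous parameters
            means[k] = sum(r * x for r, x in zip(resp, data)) / n_k
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            stds[k] = math.sqrt(max(var, 1e-6))  # floor avoids a zero variance
        weight0 = sum(resp0) / len(data)

    return weight0, means, stds
```

Calling `em_two_gaussians(data)` runs the deterministic version, and `em_two_gaussians(data, stochastic=True)` runs the sampled one. On well-separated clusters the two should land on similar estimates, although the stochastic iterates keep fluctuating around the solution rather than settling on a fixed point.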
