Streaming Variational Bayes

We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior. The framework makes streaming updates to the estimated posterior according to a user-specified batch approximation primitive. We demonstrate the usefulness of our framework, with variational Bayes (VB) as the primitive, by fitting the latent Dirichlet allocation model to two large-scale document collections. We demonstrate the advantages of our algorithm over stochastic variational inference (SVI) by comparing the two in two settings: after a single pass through a known amount of data, a case where SVI may be applied, and in the streaming setting, where SVI does not apply.
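The streaming idea can be illustrated with a minimal sketch: the posterior computed from one minibatch serves as the prior for the next, and the per-batch update is delegated to a pluggable primitive. Everything in the block is an assumption made for illustration and is not taken from the paper: the function names `batch_primitive` and `streaming_posterior`, the toy data, and the choice of a conjugate Beta-Bernoulli model, for which the primitive is exact. The framework described above uses an approximate primitive (VB for latent Dirichlet allocation) and also covers distributed and asynchronous computation, neither of which is shown here.

```python
# Hypothetical illustration of streaming Bayesian updating with a
# user-supplied batch primitive. Model: Bernoulli data, Beta(alpha, beta) prior.

def batch_primitive(prior, batch):
    """Map (prior, minibatch) to a posterior.

    Assumption: exact conjugate Beta-Bernoulli update. An approximate
    primitive such as VB could be swapped in with the same signature.
    """
    alpha, beta = prior
    successes = sum(batch)
    failures = len(batch) - successes
    return (alpha + successes, beta + failures)


def streaming_posterior(prior, minibatches, primitive=batch_primitive):
    """Fold the primitive over a stream: each posterior becomes the next prior."""
    posterior = prior
    for batch in minibatches:
        posterior = primitive(posterior, batch)
    return posterior


# Toy stream of three minibatches of 0/1 observations.
stream = [[1, 0, 1], [1, 1], [0, 0, 1, 0]]
prior = (1.0, 1.0)  # uniform Beta prior

streamed = streaming_posterior(prior, stream)
all_at_once = batch_primitive(prior, [x for batch in stream for x in batch])
print(streamed, all_at_once)
```

Because the primitive in this sketch is exact, the streamed result matches a single batch update over all the data. With an approximate primitive the two generally differ, and the total amount of data does not need to be known in advance.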
