Slice Sampling for General Completely Random Measures

Completely random measures provide a principled approach to constructing flexible unsupervised models in which the number of latent features is infinite and the number of features that influence the data grows with the size of the data set. Because the latent features are countably infinite, posterior inference requires either marginalization, which induces dependence structures that preclude efficient computation via parallelization and conjugacy, or finite truncation, which arbitrarily limits the flexibility of the model. In this paper we present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables, enabling efficient, parallelized computation without sacrificing flexibility. In contrast to past work that achieved this on a model-by-model basis, we provide a general recipe applicable to the broad class of completely random measure-based priors. The efficacy of the proposed algorithm is evaluated on several popular nonparametric models, demonstrating a higher effective sample size per second than marginalization-based algorithms and higher predictive performance than models employing fixed truncations.
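To make the adaptive-truncation idea concrete, below is a minimal sketch for the simplest CRM-based example: a Dirichlet process mixture of unit-variance Gaussians with a conjugate normal prior on component means, represented by stick-breaking. The model choice, hyperparameters (alpha, tau0), and function names (stick_weights, slice_sweep) are illustrative assumptions for exposition, not the paper's implementation; the paper's recipe covers the much broader class of completely random measure-based priors.

import numpy as np

rng = np.random.default_rng(0)

def stick_weights(v):
    # Convert stick-breaking fractions v_k into mixture weights
    # w_k = v_k * prod_{j<k} (1 - v_j).
    w = np.empty_like(v)
    remaining = 1.0
    for k, vk in enumerate(v):
        w[k] = vk * remaining
        remaining *= 1.0 - vk
    return w

def slice_sweep(x, z, v, mu, alpha=1.0, tau0=1.0):
    # One Gibbs sweep with auxiliary slice variables u_i. The slices make
    # each conditional depend on only finitely many atoms, so the
    # truncation level adapts per iteration instead of being fixed.
    w = stick_weights(v)

    # 1. Slice variables: u_i ~ Uniform(0, w_{z_i}).
    u = rng.uniform(0.0, w[z])

    # 2. Adaptively extend the representation until the unrepresented stick
    #    mass falls below min_i u_i; atoms beyond that point can never be
    #    selected, so the truncation is exact rather than approximate.
    while 1.0 - w.sum() > u.min():
        v = np.append(v, rng.beta(1.0, alpha))
        mu = np.append(mu, rng.normal(0.0, 1.0 / np.sqrt(tau0)))
        w = stick_weights(v)

    # 3. Resample assignments among the finitely many atoms above the slice.
    for i in range(len(x)):
        candidates = np.flatnonzero(w > u[i])
        loglik = -0.5 * (x[i] - mu[candidates]) ** 2  # unit-variance Gaussian
        p = np.exp(loglik - loglik.max())
        z[i] = rng.choice(candidates, p=p / p.sum())

    # 4. Conjugate updates for sticks and atom locations given the counts;
    #    each atom is updated independently, so this step parallelizes.
    K = len(v)
    counts = np.bincount(z, minlength=K)
    tail = counts[::-1].cumsum()[::-1]  # tail[k] = sum_{j >= k} counts[j]
    for k in range(K):
        v[k] = rng.beta(1.0 + counts[k], alpha + tail[k] - counts[k])
        members = x[z == k]
        prec = tau0 + len(members)
        mu[k] = rng.normal(members.sum() / prec, 1.0 / np.sqrt(prec))
    return z, v, mu

# Toy run: two well-separated Gaussian clusters.
x = np.concatenate([rng.normal(-3, 1, 50), rng.normal(3, 1, 50)])
z = np.zeros(len(x), dtype=int)
v, mu = np.array([0.5]), np.array([0.0])
for _ in range(200):
    z, v, mu = slice_sweep(x, z, v, mu)
print("atoms represented:", len(v), "occupied:", len(np.unique(z)))

The slice variables render each conditional finite: only atoms whose weights exceed the slice can be selected, so the representation is extended exactly as far as each sweep requires, and step 4 updates every instantiated atom independently, which is where the parallelism described in the abstract comes from.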
