Tractable Bayesian density regression via logit stick-breaking priors

There is growing interest in learning how the distribution of a response variable changes with a set of predictors. Bayesian nonparametric dependent mixture models provide a flexible approach to this goal, but several formulations require computationally demanding algorithms for posterior inference. Motivated by this issue, we study a class of predictor-dependent infinite mixture models which relies on a simple representation of the stick-breaking prior via sequential logistic regressions. This formulation preserves the desirable properties of popular predictor-dependent stick-breaking priors, and leverages a recent Pólya-gamma data augmentation to facilitate the implementation of several computational methods for posterior inference. These routines include Markov chain Monte Carlo via Gibbs sampling, expectation-maximization algorithms, and mean-field variational Bayes for scalable inference, thereby encouraging wider adoption of Bayesian density regression by practitioners. The algorithms associated with these methods are presented in detail and tested in a toxicology study.
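For concreteness, the construction alluded to above can be sketched as follows; the notation is illustrative (a generic predictor vector x, H mixture components with kernels K(y; \theta_h), and logistic-regression coefficients \alpha_h) rather than a verbatim transcription of the paper's formulas. The conditional density is modeled as a predictor-dependent mixture

f(y \mid x) = \sum_{h=1}^{H} \pi_h(x) \, K(y; \theta_h),
\qquad
\pi_h(x) = \nu_h(x) \prod_{l=1}^{h-1} \{ 1 - \nu_l(x) \},

with each stick-breaking proportion defined by a logistic regression on the predictors, \nu_h(x) = [1 + \exp(-x^{\mathsf{T}} \alpha_h)]^{-1}. Because the \nu_h(x) enter the likelihood only through logistic terms, the Pólya-gamma identity of Polson, Scott and Windle,

\frac{\{\exp(\eta)\}^{z}}{1 + \exp(\eta)} = \frac{1}{2} \, e^{\kappa \eta} \int_{0}^{\infty} e^{-\omega \eta^{2}/2} \, p(\omega) \, \mathrm{d}\omega,
\qquad \kappa = z - \tfrac{1}{2}, \quad \omega \sim \mathrm{PG}(1, 0),

makes every conditional update of the \alpha_h Gaussian given the latent \omega. This conditional conjugacy is what the Gibbs sampling, expectation-maximization, and mean-field variational routines mentioned above exploit.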
