A Stick-Breaking Likelihood for Categorical Data Analysis with Latent Gaussian Models

The development of accurate models and efficient algorithms for the analysis of multivariate categorical data is an important and longstanding problem in machine learning and computational statistics. In this paper, we focus on modeling categorical data using Latent Gaussian Models (LGMs). We propose a novel stick-breaking likelihood function for categorical LGMs that exploits accurate linear and quadratic bounds on the logistic log-partition function, leading to an effective variational inference and learning framework. We thoroughly compare our approach to existing algorithms for multinomial logit/probit likelihoods on several problems, including inference in multinomial Gaussian process classification and learning in latent factor models. Our extensive comparisons demonstrate that our stick-breaking model effectively captures correlation in discrete data and is well suited for the analysis of categorical data.
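To make the stick-breaking idea concrete, the sketch below maps K-1 real-valued latent scores to K category probabilities by repeatedly breaking off a logistic fraction of the remaining probability mass. The abstract does not spell out the exact parameterization, so the form used here (category k receives sigmoid(eta_k) times the mass left over by the earlier categories, and the last category receives whatever remains) is an assumption based on the standard logistic stick-breaking construction. The names `sigmoid` and `stick_breaking_probs` are hypothetical and chosen for illustration only.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def stick_breaking_probs(eta):
    """Map K-1 latent scores to K category probabilities.

    Assumed form (not confirmed by the abstract):
        p(y = k) = sigmoid(eta[k]) * prod_{j < k} (1 - sigmoid(eta[j]))
    with the final category taking the remaining mass.
    """
    probs = []
    remaining = 1.0  # portion of the unit stick not yet assigned
    for e in eta:
        s = sigmoid(e)
        probs.append(s * remaining)  # break off a fraction s of what is left
        remaining *= 1.0 - s         # shrink the stick for later categories
    probs.append(remaining)          # last category gets the leftover mass
    return probs


if __name__ == "__main__":
    print(stick_breaking_probs([0.0, 0.0]))
```

Because each factor is a binary logistic term, the log-likelihood under this construction decomposes into a sum of logistic log-partition functions, which is where bounds on that function could be applied term by term.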
