Incorporating Marginal Prior Information in Latent Class Models

We present an approach to incorporating informative prior beliefs about marginal probabilities into Bayesian latent class models for categorical data. The basic idea is to append synthetic observations to the original data such that (i) the empirical distributions of the desired margins match those of the prior beliefs, and (ii) the values of the remaining variables are left missing. The degree of prior uncertainty is controlled by the number of augmented records. Posterior inferences can be obtained via typical MCMC algorithms for latent class models, tailored to deal efficiently with the missing values in the concatenated data. We illustrate the approach using a variety of simulations based on data from the American Community Survey, including an example of how augmented records can be used to fit latent class models to data from stratified samples.
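The augmentation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `augmented_records`, the `MISSING` code of -1, and the toy data are all hypothetical, and the sketch handles a prior on a single univariate margin only. Category counts are assigned by largest-remainder rounding so that the empirical margin of the synthetic rows matches the prior as closely as `n_aug` allows.

```python
import numpy as np

MISSING = -1  # hypothetical code for a missing value (assumption, not from the paper)


def augmented_records(prior_probs, n_aug, n_vars, col):
    """Build n_aug synthetic rows: column `col` follows prior_probs, the rest are missing."""
    probs = np.asarray(prior_probs, dtype=float)
    raw = probs * n_aug
    counts = np.floor(raw).astype(int)
    # Largest-remainder rounding so the category counts sum exactly to n_aug.
    shortfall = n_aug - counts.sum()
    order = np.argsort(-(raw - counts), kind="stable")
    counts[order[:shortfall]] += 1
    # Every variable starts as missing; only the margin with prior information is filled.
    rows = np.full((n_aug, n_vars), MISSING, dtype=int)
    rows[:, col] = np.repeat(np.arange(len(probs)), counts)
    return rows


# Toy observed data: 3 records on 3 categorical variables (made-up values).
observed = np.array([[0, 1, 2], [1, 0, 0], [1, 1, 1]])

# Prior belief about variable 0: P(0) = 0.5, P(1) = 0.3, P(2) = 0.2.
# A larger n_aug expresses a more confident prior.
synthetic = augmented_records([0.5, 0.3, 0.2], n_aug=10, n_vars=3, col=0)
combined = np.vstack([observed, synthetic])
```

The concatenated array `combined` is what a latent class sampler would then be fit to, with the missing entries of the synthetic rows handled by the sampler rather than filled in beforehand. Increasing `n_aug` relative to the number of observed records tightens the prior on the chosen margin.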
