Conjugate and conditional conjugate Bayesian analysis of discrete graphical models of marginal independence

We propose a conjugate and conditional conjugate Bayesian analysis of models of marginal independence with a bi-directed graph representation. We work with Markov equivalent directed acyclic graphs (DAGs) obtained using the same vertex set, with latent vertices added when required. The DAG equivalent model is characterised by a minimal set of marginal and conditional probability parameters, which allows us to use compatible prior distributions based on products of Dirichlet distributions. For models with a DAG representation on the same vertex set, the posterior distribution and the marginal likelihood are analytically available; for the remaining models, a data augmentation scheme introducing additional latent variables is required, and we estimate the marginal likelihood using Chib’s (1995) estimator. Additional implementation details, including the identifiability of such models, are discussed. For all models, we also provide methodology for computing the posterior distributions of the marginal log-linear parameters, based on a simple transformation of the simulated values of the probability parameters. We illustrate our method using a popular 4-way dataset.
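The analytically available case can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a hypothetical 2x2 table of counts and a Dirichlet prior with parameter 1 on each cell of the saturated model; the function names and the data are illustrative only. Under these assumptions, the marginal independence model for two variables A and B corresponds to a DAG with no edges, and its marginal likelihood is a product of two Dirichlet-multinomial terms whose prior parameters are the row and column sums of the cell parameters.

```python
from math import lgamma

def log_dirichlet_multinomial(counts, alpha):
    """Log marginal likelihood of an ordered categorical sample with the
    given category counts under a Dirichlet(alpha) prior."""
    total = lgamma(sum(alpha)) - lgamma(sum(alpha) + sum(counts))
    for c, a in zip(counts, alpha):
        total += lgamma(a + c) - lgamma(a)
    return total

def posterior_alpha(counts, alpha):
    """Conjugate update: the posterior is Dirichlet(alpha + counts)."""
    return [a + c for c, a in zip(counts, alpha)]

# Hypothetical data (rows index A, columns index B) and cell prior parameters.
table = [[10, 20], [30, 40]]
cell_alpha = [[1.0, 1.0], [1.0, 1.0]]

row_counts = [sum(r) for r in table]
col_counts = [sum(c) for c in zip(*table)]
row_alpha = [sum(r) for r in cell_alpha]
col_alpha = [sum(c) for c in zip(*cell_alpha)]

# Marginal independence of A and B: one Dirichlet term per vertex.
log_ml_indep = (log_dirichlet_multinomial(row_counts, row_alpha)
                + log_dirichlet_multinomial(col_counts, col_alpha))

# Saturated DAG A -> B: P(A) times P(B | A = a) for each level a.
log_ml_sat = log_dirichlet_multinomial(row_counts, row_alpha) + sum(
    log_dirichlet_multinomial(r, a) for r, a in zip(table, cell_alpha))

# Log Bayes factor of independence against the saturated model.
log_bf = log_ml_indep - log_ml_sat
```

With this choice of prior parameters, the factorised saturated marginal likelihood equals the one computed directly from a Dirichlet prior on the four cells, which is the sense in which the priors are compatible in this sketch. Models that need latent vertices have no such closed form and are not covered here.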

[1]  A. Roverato,et al.  Log-mean linear models for binary data , 2011, 1109.6239.

[2]  Weixin Yao,et al.  Model based labeling for mixture models , 2012, Stat. Comput..

[3]  Steffen L. Lauritzen,et al.  Bayesian methods with applications to science, policy and official statistics, ISBA 2000 , 2000 .

[4]  M W Knuiman,et al.  Incorporating prior information into the analysis of contingency tables. , 1988, Biometrics.

[5]  Guido Consonni,et al.  Compatibility of prior specifications across linear models , 2008, 1102.2981.

[6]  George Iliopoulos,et al.  An Artificial Allocations Based Solution to the Label Switching Problem in Bayesian Analysis of Mixtures of Distributions , 2010 .

[7]  S. Chib Marginal Likelihood from the Gibbs Output , 1995 .

[8]  Ajay Jasra,et al.  Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modeling , 2005 .

[9]  Judea Pearl,et al.  When can association graphs admit a causal interpretation , 1994 .

[10]  Jin Tian,et al.  On the Testable Implications of Causal Models with Hidden Variables , 2002, UAI.

[11]  J. Berger,et al.  Optimal predictive model selection , 2004, math/0406464.

[12]  Francesco Bartolucci,et al.  Bayesian inference through encompassing priors and importance sampling for a class of marginal models for categorical data , 2012, Comput. Stat. Data Anal..

[13]  A. Coppen The Marke-Nyman temperament scale: an English translation. , 1966, The British journal of medical psychology.

[14]  Alberto Roverato,et al.  Compatible Prior Distributions for DAG Models , 2001 .

[15]  Guido Consonni,et al.  Compatible prior distributions for directed acyclic graph models , 2004 .

[16]  T. Richardson,et al.  Marginal log‐linear parameters for graphical Markov models , 2011, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[17]  Wray L. Buntine Theory Refinement on Bayesian Networks , 1991, UAI.

[18]  P. Dellaportas,et al.  Markov chain Monte Carlo model determination for hierarchical and graphical log-linear models , 1999 .

[19]  C. Robert,et al.  Estimation of Finite Mixture Distributions Through Bayesian Sampling , 1994 .

[20]  Zoubin Ghahramani,et al.  The Hidden Life of Latent Variables: Bayesian Learning with Mixed Graph Models , 2009, J. Mach. Learn. Res..

[21]  S. Orbom,et al.  When Can Association Graphs Admit A Causal Interpretation? , 1993 .

[22]  T. Richardson,et al.  Binary models for marginal independence , 2007, 0707.3794.

[23]  P. Green Reversible jump Markov chain Monte Carlo computation and Bayesian model determination , 1995 .

[24]  D. Lindley A STATISTICAL PARADOX , 1957 .

[25]  Wicher P. Bergsma,et al.  Marginal models for categorical data , 2002 .

[26]  M. D. Martínez-Miranda,et al.  Computational Statistics and Data Analysis , 2009 .

[27]  Scott A. Sisson,et al.  Transdimensional Markov Chains , 2005 .

[28]  Robin J. Evans,et al.  Graphical methods for inequality constraints in marginalized DAGs , 2012, 2012 IEEE International Workshop on Machine Learning for Signal Processing.

[29]  Maya R. Gupta,et al.  Introduction to the Dirichlet Distribution and Related Processes , 2010 .

[30]  Steffen L. Lauritzen,et al.  Compatible prior distributions , 2000 .

[31]  M. Stephens Dealing with label switching in mixture models , 2000 .

[32]  I. Ntzoufras,et al.  Bayesian Analysis of Graphical Models of Marginal Independence for Three Way Contingency Tables , 2012 .

[33]  T. Richardson Markov Properties for Acyclic Directed Mixed Graphs , 2003 .

[34]  M. Bartlett A comment on D. V. Lindley's statistical paradox , 1957 .

[35]  Christian Robert,et al.  Approximating the marginal likelihood in mixture models , 2007 .

[36]  Aram Galstyan,et al.  A Sequence of Relaxations Constraining Hidden Variable Models , 2011, UAI 2011.

[37]  Monia Lupparelli,et al.  Graphical models of marginal independence for categorical variables , 2005 .

[38]  Monia Lupparelli,et al.  Parameterizations and Fitting of Bi‐directed Graph Models to Categorical Data , 2008, 0801.1440.

[39]  N. Wermuth Model Search among Multiplicative Models , 1976 .

[40]  Tamás Rudas,et al.  On applications of marginal models for categorical data , 2004 .

[41]  S. Frühwirth-Schnatter Markov chain Monte Carlo Estimation of Classical and Dynamic Switching and Mixture Models , 2001 .

[42]  Wilfred Perks,et al.  Some observations on inverse probability including a new indifference rule , 1947 .

[43]  Gérard Letac,et al.  Wishart distributions for decomposable graphs , 2007, 0708.2380.

[44]  Kshitij Khare,et al.  Wishart distributions for decomposable covariance graph models , 2011, 1103.1768.

[45]  Jin Tian,et al.  Inequality Constraints in Causal Models with Hidden Variables , 2006, UAI.

[46]  N. Wermuth,et al.  Linear Dependencies Represented by Chain Graphs , 1993 .

[47]  Thomas S. Richardson,et al.  Graphical Methods for Efficient Likelihood Inference in Gaussian Covariance Models , 2007, J. Mach. Learn. Res..

[48]  Scott A. Sisson,et al.  Trans-dimensional Markov chains : A decade of progress and future perspectives , 2004 .

[49]  Zoubin Ghahramani,et al.  Factorial Mixture of Gaussians and the Marginal Independence Model , 2009, AISTATS.

[50]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.