Bayesian Learning with Mixtures of Trees

We present a Bayesian method for learning mixtures of graphical models, focusing on data clustering with a tree-structured model for each cluster. A Markov chain Monte Carlo method draws a sample of clusterings, and the likelihood of each clustering is computed by exact averaging over the model class, including the dependency structure on the variables. Experiments on synthetic data show that this method usually outperforms the expectation–maximization algorithm of Meilă and Jordan [16] when the number of observations is small (hundreds) and the number of variables is large (dozens). We apply the method to study how much information single nucleotide polymorphisms carry about the structure of human populations.
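A word on why exact averaging over dependency structures is tractable: when the prior and likelihood decompose over edges, the sum over all spanning-tree structures of a product of per-edge weights equals a cofactor of the weighted graph Laplacian (the Matrix-Tree theorem), the device behind tractable Bayesian learning of tree belief networks [18]. The sketch below only illustrates that identity and is not the authors' implementation; the symmetric edge weights W[u, v], standing in for hypothetical exponentiated local marginal-likelihood terms, and all function names are assumptions introduced here.

```python
import numpy as np
from itertools import combinations

def sum_over_trees(W):
    """Sum over all spanning trees of the product of edge weights W[u, v],
    via the Matrix-Tree theorem: the sum equals any cofactor of the
    weighted graph Laplacian; here we take the one deleting row/column 0."""
    L = np.diag(W.sum(axis=1)) - W       # weighted Laplacian
    return np.linalg.det(L[1:, 1:])      # (0, 0) cofactor

def sum_over_trees_brute_force(W):
    """Same quantity by explicit enumeration of spanning trees (tiny n only)."""
    n = W.shape[0]
    edges = list(combinations(range(n), 2))
    total = 0.0
    for tree in combinations(edges, n - 1):   # candidate edge sets of size n-1
        parent = list(range(n))               # union-find to detect cycles
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        acyclic = True
        for u, v in tree:
            ru, rv = find(u), find(v)
            if ru == rv:
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:                           # n-1 acyclic edges => spanning tree
            total += np.prod([W[u, v] for u, v in tree])
    return total

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5
    # Hypothetical symmetric edge weights, e.g. exponentiated local
    # marginal-likelihood contributions for linking variables u and v.
    W = rng.uniform(0.5, 2.0, size=(n, n))
    W = np.triu(W, 1)
    W = W + W.T                               # symmetric, zero diagonal
    print(sum_over_trees(W))                  # determinant of a cofactor
    print(sum_over_trees_brute_force(W))      # same value, by enumeration
```

Since the determinant costs O(n^3), averaging over all n^(n-2) labelled trees on n variables stays polynomial in the number of variables, which is where the complexity of computing determinants [3] enters; the brute-force check above is feasible only for tiny n.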

[1] Ajay Jasra, et al. Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modeling, 2005.

[2] P. Jaccard. The Distribution of the Flora in the Alpine Zone, 1912.

[3] Erich Kaltofen, et al. On the Complexity of Computing Determinants, 2001.

[4] F. Lutzoni, et al. Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence. Molecular Biology and Evolution, 2003.

[5] W. K. Hastings. Monte Carlo Sampling Methods Using Markov Chains and Their Applications, 1970.

[6] Ramón López de Mántaras, et al. TAN Classifiers Based on Decomposable Distributions. Machine Learning, 2005.

[7] David Maxwell Chickering, et al. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data. Machine Learning, 1994.

[8] K. J. Dawson, et al. A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genetical Research, 2001.

[9] Marina Meilă. Comparing Clusterings by the Variation of Information. COLT, 2003.

[10] C. K. Chow, C. N. Liu. Approximating discrete probability distributions with dependence trees. IEEE Trans. Inf. Theory, 1968.

[11] Nir Friedman, et al. Being Bayesian About Network Structure: A Bayesian Approach to Structure Discovery in Bayesian Networks. Machine Learning, 2004.

[12] Richard E. Neapolitan, et al. Learning Bayesian Networks. KDD '07, 2007.

[13] N. Camp, et al. Graphical modeling of the joint distribution of alleles at associated loci. American Journal of Human Genetics, 2004.

[14] C. Holmes, et al. Markov Chain Monte Carlo Methods and the Label Switching Problem in Bayesian Mixture Modelling, 2004.

[15] C. Robert, et al. Estimation of Finite Mixture Distributions Through Bayesian Sampling, 1994.

[16] Marina Meilă, Michael I. Jordan. Learning with Mixtures of Trees. J. Mach. Learn. Res., 2001.

[17] Carl E. Rasmussen. The Infinite Gaussian Mixture Model. NIPS, 1999.

[18] Tommi S. Jaakkola, et al. Tractable Bayesian learning of tree belief networks. Stat. Comput., 2000.

[19] Radford M. Neal. Markov Chain Sampling Methods for Dirichlet Process Mixture Models, 2000.

[20] M. Sillanpää, et al. Bayesian analysis of genetic differentiation between populations. Genetics, 2003.

[21] B. Efron. Bootstrap Methods: Another Look at the Jackknife. The Annals of Statistics, 1979.

[22] Mikko Koivisto, et al. Exact Bayesian Structure Discovery in Bayesian Networks. J. Mach. Learn. Res., 2004.

[23] Geoffrey B. Nilsen, et al. Whole-Genome Patterns of Common DNA Variation in Three Human Populations. Science, 2005.

[24] C. Robert, et al. Computational and Inferential Difficulties with Mixture Posterior Distributions, 2000.