Bayesian Learning in Bayesian Networks of Moderate Size by Efficient Sampling

We study the Bayesian model averaging approach to learning Bayesian network structures (DAGs) from data. We develop new algorithms, including the first algorithm that can efficiently sample DAGs according to the exact structure posterior. The DAG samples can then be used to construct estimators for the posterior of any feature. Our estimators have several good properties; for example, unlike existing MCMC-based algorithms, they come with a quality guarantee under the order-modular prior. We show empirically that our algorithms considerably outperform previous state-of-the-art methods.
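To illustrate how DAG samples can yield an estimator for a feature posterior, here is a minimal sketch. It assumes the samples are independent draws from the exact structure posterior, and it uses the plain sample-fraction estimator with a Hoeffding-style sample-size bound; this is one standard construction and not necessarily the estimators developed in the paper. All names (`estimate_feature_posterior`, `hoeffding_sample_size`, `has_edge`) and the toy samples are hypothetical.

```python
import math
from typing import Callable, FrozenSet, Iterable, Tuple

Edge = Tuple[str, str]   # directed edge (parent, child)
DAG = FrozenSet[Edge]    # a DAG represented by its edge set (assumed encoding)


def estimate_feature_posterior(samples: Iterable[DAG],
                               feature: Callable[[DAG], bool]) -> float:
    """Fraction of sampled DAGs in which the feature holds.

    If the samples are i.i.d. draws from the exact posterior, this
    fraction is an unbiased estimate of the feature's posterior probability.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(1 for g in samples if feature(g)) / len(samples)


def hoeffding_sample_size(epsilon: float, delta: float) -> int:
    """Smallest T with 2 * exp(-2 * T * epsilon**2) <= delta.

    By Hoeffding's inequality, T i.i.d. samples then give an estimate
    within epsilon of the true value with probability at least 1 - delta.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie in (0, 1)")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def has_edge(parent: str, child: str) -> Callable[[DAG], bool]:
    """Feature: the directed edge parent -> child is present."""
    return lambda g: (parent, child) in g


# Hand-written stand-ins for sampler output (not real posterior samples).
toy_samples = [
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B")}),
    frozenset({("B", "A"), ("B", "C")}),
    frozenset({("A", "B"), ("A", "C")}),
]

edge_ab_estimate = estimate_feature_posterior(toy_samples, has_edge("A", "B"))
samples_needed = hoeffding_sample_size(epsilon=0.05, delta=0.05)
```

The bound depends only on the number of samples and not on the number of variables, which is the kind of guarantee that is hard to state for MCMC output, where samples are correlated and only approximately follow the posterior.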
