New skeleton-based approaches for Bayesian structure learning of Bayesian networks

Automatically learning the graph structure of a single Bayesian network (BN) that accurately represents the underlying multivariate probability distribution of a collection of random variables is a challenging task. Obtaining a Bayesian solution to this problem, that is, computing the posterior probability of the presence of an edge, of a directed path between two variables, or of any other structural feature, is far more involved, since it requires averaging over all possible graph structures. For the former problem, recent advances have shown that search+score approaches find much more accurate structures when the search is constrained by a previously inferred skeleton (i.e., a relaxed structure with undirected edges, which can be inferred using methods based on local search). Building on similar ideas, we propose two novel skeleton-based approaches to approximate a Bayesian solution to the BN learning problem: a new stochastic search that tries to find directed acyclic graph (DAG) structures with a non-negligible score, and a new Markov chain Monte Carlo method over the DAG space. Both approaches follow the same two-step scheme. In the first step, they take a previously given skeleton and build a Bayesian solution constrained by it. In the second step, they use this preliminary solution to obtain a new Bayesian approximation, this time over the unconstrained graph space; this approximation is the final outcome of the methods. As shown in the experimental evaluation, this scheme strongly boosts the performance of these two standard techniques, showing that employing a skeleton to constrain the model space is also a successful strategy for Bayesian structure learning of BNs.
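To make the averaging concrete, the following minimal Python sketch computes the posterior probability of an edge by brute-force enumeration on a three-variable toy problem, with and without a skeleton constraint. It is an illustration only and not the stochastic search or MCMC method proposed here: the function names (`is_acyclic`, `enumerate_dags`, `edge_posterior`), the example skeleton, and the uniform placeholder score are all assumptions introduced for this example, and exhaustive enumeration is feasible only for a handful of variables.

```python
from itertools import combinations, product
import math


def is_acyclic(nodes, edges):
    """Return True if the directed graph (nodes, edges) has no directed cycle."""
    remaining = set(nodes)
    while remaining:
        # Nodes with no incoming edge from any node still in the graph.
        sources = [n for n in remaining
                   if not any(v == n and u in remaining for u, v in edges)]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining -= set(sources)
    return True


def enumerate_dags(nodes, skeleton=None):
    """Yield every DAG over `nodes` as a frozenset of directed edges (u, v).

    If `skeleton` (a list of undirected pairs) is given, only those pairs may
    carry an edge; each pair is absent, oriented a->b, or oriented b->a.
    """
    pairs = list(combinations(nodes, 2)) if skeleton is None else [tuple(p) for p in skeleton]
    for choice in product((None, 0, 1), repeat=len(pairs)):
        edges = frozenset((a, b) if c == 0 else (b, a)
                          for (a, b), c in zip(pairs, choice) if c is not None)
        if is_acyclic(nodes, edges):
            yield edges


def edge_posterior(dags, log_score, edge):
    """Posterior probability of `edge`, averaging over `dags` weighted by exp(log_score)."""
    dags = list(dags)
    logs = [log_score(g) for g in dags]
    m = max(logs)  # subtract the maximum for numerical stability
    weights = [math.exp(l - m) for l in logs]
    return sum(w for g, w in zip(dags, weights) if edge in g) / sum(weights)


nodes = ["A", "B", "C"]
skeleton = [("A", "B"), ("B", "C")]   # hypothetical skeleton: A - B - C


def uniform(graph):
    """Placeholder score (assumption); a real run would use a data-dependent score."""
    return 0.0


all_dags = list(enumerate_dags(nodes))
constrained = list(enumerate_dags(nodes, skeleton))
p_full = edge_posterior(all_dags, uniform, ("A", "B"))
p_skel = edge_posterior(constrained, uniform, ("A", "B"))
print(len(all_dags), len(constrained))  # 25 9
```

Even in this tiny case the skeleton reduces the model space from 25 DAGs to 9. With the uniform placeholder score the probability of the edge A -> B is simply the fraction of DAGs that contain it; replacing `uniform` with a data-dependent score turns the same average into an actual posterior.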