Learning Bayesian networks from Markov random fields: An efficient algorithm for linear models

Dependency analysis is a typical approach to Bayesian network learning: it infers the structure of a Bayesian network from the results of a series of conditional independence (CI) tests. In practice, testing independence conditional on large sets hampers both the accuracy and the running time of dependency analysis algorithms, for two reasons. First, CI tests with large conditioning sets are unreliable when samples are limited. Second, for most dependency analysis algorithms, the number of CI tests grows exponentially with the sizes of the conditioning sets, and the running time grows at the same rate. Reducing the number of CI tests and the sizes of conditioning sets is therefore a critical step in dependency analysis algorithms. In this article, we present a two-phase algorithm based on the observation that the structures of Markov random fields are similar to those of Bayesian networks. The first phase constructs a Markov random field from data, which closely approximates the structure of the true Bayesian network; the second phase removes redundant edges according to CI tests to obtain the true Bayesian network. Both phases use Markov blanket information to reduce the sizes of conditioning sets and the number of CI tests without sacrificing accuracy. An empirical study shows that the two-phase algorithm performs well in terms of accuracy and efficiency.
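
The two-phase idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear relations with roughly Gaussian noise, uses partial correlation with a Fisher z-test as the CI test, and recovers only the undirected skeleton. All function names and the `z_crit` threshold are hypothetical choices made for illustration.

```python
import math
from itertools import combinations


def correlation_matrix(data):
    """Sample Pearson correlation matrix; `data` is a list of sample rows."""
    n, p = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(p)]
    means = [sum(c) / n for c in cols]
    centered = [[v - m for v in c] for c, m in zip(cols, means)]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def partial_corr(corr, i, j, cond):
    """Partial correlation of i and j given `cond`, via the standard recursion.

    Cost grows exponentially with len(cond), so this suits toy examples only.
    """
    if not cond:
        return corr[i][j]
    k, rest = cond[0], cond[1:]
    r_ij = partial_corr(corr, i, j, rest)
    r_ik = partial_corr(corr, i, k, rest)
    r_jk = partial_corr(corr, j, k, rest)
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def independent(corr, n, i, j, cond, z_crit=2.576):
    """Fisher z-test: True if zero partial correlation is not rejected."""
    r = max(min(partial_corr(corr, i, j, cond), 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    return math.sqrt(n - len(cond) - 3) * abs(z) < z_crit


def two_phase_skeleton(data, z_crit=2.576):
    """Return (mrf_edges, skeleton_edges) as sets of index pairs (i, j), i < j."""
    n, p = len(data), len(data[0])
    corr = correlation_matrix(data)

    # Phase 1 (assumed form): connect i and j in the Markov random field
    # unless they are independent given all remaining variables.
    mrf = set()
    for i, j in combinations(range(p), 2):
        others = tuple(k for k in range(p) if k not in (i, j))
        if not independent(corr, n, i, j, others, z_crit):
            mrf.add((i, j))

    # Phase 2 (assumed form): drop an edge if some subset of either
    # endpoint's MRF neighbours renders the pair independent.
    skeleton = set(mrf)
    for i, j in sorted(mrf):
        for a, b in ((i, j), (j, i)):
            nbrs = [k for k in range(p)
                    if k != b and tuple(sorted((a, k))) in mrf]
            if any(independent(corr, n, i, j, s, z_crit)
                   for size in range(len(nbrs) + 1)
                   for s in combinations(nbrs, size)):
                skeleton.discard((i, j))
                break
    return mrf, skeleton
```

For data generated from a collider X -> Z <- Y, the first phase is expected to link X and Y, because conditioning on Z makes them dependent, and the second phase is expected to remove that link, because X and Y are marginally independent. Restricting the phase 2 conditioning sets to MRF neighbours is what keeps them small in this sketch.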
