A Comparison of Novel and State-of-the-Art Polynomial Bayesian Network Learning Algorithms

Learning the most probable a posteriori Bayesian network from data has been shown to be an NP-Hard problem and typical state-of-the-art algorithms are exponential in the worst case. However, an important open problem in the field is to identify the least restrictive set of assumptions and corresponding algorithms under which learning the optimal network becomes polynomial. In this paper, we present a technique for learning the skeleton of a Bayesian network, called Polynomial Max-Min Skeleton (PMMS), and compare it with Three Phase Dependency Analysis, another state-ofthe-art polynomial algorithm. This analysis considers both the theoretical and empirical differences between the two algorithms, and demonstrates PMMS’s advantages in both respects. When extended with a greedy hill-climbing Bayesianscoring search to orient the edges, the novel algorithm proved more time efficient, scalable, and accurate in quality of reconstruction than most state-of-the-art Bayesian network learning algorithms. The results show promise of the existence of polynomial algorithms that are provably correct under minimal distributional assumptions.

[1]  Constantin F. Aliferis,et al.  An Algorithm For Generation of Large Bayesian Networks , 2003 .

[2]  David Maxwell Chickering,et al.  Large-Sample Learning of Bayesian Networks is NP-Hard , 2002, J. Mach. Learn. Res..

[3]  Richard E. Neapolitan,et al.  Learning Bayesian networks , 2007, KDD '07.

[4]  David A. Bell,et al.  Learning Bayesian networks from data: An information-theory based approach , 2002, Artif. Intell..

[5]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems - networks of plausible inference , 1991, Morgan Kaufmann series in representation and reasoning.

[6]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[7]  Andrew W. Moore,et al.  Optimal Reinsertion: A New Search Operator for Accelerated and More Accurate Bayesian Network Structure Learning , 2003, ICML.

[8]  Judea Pearl,et al.  Probabilistic reasoning in intelligent systems , 1988 .

[9]  P. Spirtes,et al.  Causation, prediction, and search , 1993 .

[10]  David Maxwell Chickering,et al.  Optimal Structure Identification With Greedy Search , 2002, J. Mach. Learn. Res..

[11]  Constantin F. Aliferis,et al.  The max-min hill-climbing Bayesian network structure learning algorithm , 2006, Machine Learning.

[12]  Nir Friedman,et al.  Learning Bayesian Network Structure from Massive Datasets: The "Sparse Candidate" Algorithm , 1999, UAI.

[13]  Christopher Meek,et al.  Monotone DAG Faithfulness: A Bad Assumption , 2003 .

[14]  Constantin F. Aliferis,et al.  Time and sample efficient discovery of Markov blankets and direct causal relations , 2003, KDD '03.

[15]  Constantin F. Aliferis,et al.  A Novel Algorithm for Scalable and Accurate Bayesian Network Learning , 2004, MedInfo.