Learning Bounded Tree-width Bayesian Networks using Integer Linear Programming

In many applications one wants to compute conditional probabilities from a Bayesian network. This inference problem is NP-hard in general but becomes tractable when the network has low tree-width. Because inference is needed in so many application areas, we provide a practical algorithm for learning Bayesian networks of bounded tree-width. We cast the learning problem as an integer linear program (ILP). The program can be solved by an anytime algorithm that provides upper bounds for assessing the quality of the solutions found. A key component of our program is a novel integer linear formulation for bounding the tree-width of a graph. Our tests indicate that the approach works in practice: our implementation found an optimal or nearly optimal network for most of the data sets.
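To make the quantity being bounded concrete, the following is a minimal Python sketch, not the ILP formulation of the paper. It assumes the common convention that the tree-width of a Bayesian network is the tree-width of its moral graph, and it relies on the standard fact that the width of any elimination ordering is an upper bound on the tree-width. The function names `moralize` and `elimination_width`, and the toy network, are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def moralize(parents):
    """Build the undirected moral graph of a DAG.

    `parents` maps every node to the set of its parents (assumption:
    every node, including root nodes, appears as a key).
    """
    adj = {v: set() for v in parents}
    for child, pa in parents.items():
        # Keep each arc as an undirected edge.
        for p in pa:
            adj[child].add(p)
            adj[p].add(child)
        # "Marry" the parents: connect every pair of parents of a node.
        for a, b in combinations(sorted(pa), 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def elimination_width(adj, order):
    """Width of an elimination ordering: an upper bound on tree-width."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    width = 0
    for v in order:
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        # Eliminating v turns its remaining neighbours into a clique.
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for u in nbrs:
            adj[u].discard(v)
    return width


# Toy network (hypothetical): A -> C <- B, C -> D.
parents = {"A": set(), "B": set(), "C": {"A", "B"}, "D": {"C"}}
moral = moralize(parents)
good = elimination_width(moral, ["D", "A", "B", "C"])
bad = elimination_width(moral, ["C", "A", "B", "D"])
print(good, bad)
```

In this toy case the moral graph contains the triangle A, B, C, so the tree-width is 2. The first ordering attains that value, while the second gives only the looser bound 3. Minimizing over all orderings gives the tree-width exactly, which is why bounding it inside an optimization problem requires a dedicated formulation.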
