Bayesian Network Learning via Topological Order

We propose a mixed integer programming (MIP) model and iterative algorithms based on topological orders to solve optimization problems with acyclicity constraints on a directed graph. The proposed MIP model has significantly fewer constraints than popular MIP models based on cycle elimination constraints and triangular inequalities. The two proposed iterative algorithms search over topological orders, one by gradient descent and the other by iterative reordering. A computational experiment is presented for the Gaussian Bayesian network learning problem, which minimizes the sum of squared errors of regression models with an L1 penalty over a feature network and has applications to gene network inference in bioinformatics.
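The idea behind working with topological orders can be illustrated with a minimal sketch: once an order of the nodes is fixed, any set of arcs that only point from earlier to later nodes is automatically acyclic, so no explicit cycle elimination is needed. The helper names `arcs_consistent_with_order` and `is_acyclic`, the toy graph, and the candidate order below are assumptions made for illustration; they are not the MIP model or the iterative algorithms proposed in the paper.

```python
from collections import deque


def arcs_consistent_with_order(arcs, order):
    """Keep only arcs (u, v) in which u precedes v in the given order."""
    position = {node: i for i, node in enumerate(order)}
    return [(u, v) for (u, v) in arcs if position[u] < position[v]]


def is_acyclic(nodes, arcs):
    """Kahn's algorithm: return True if the directed graph has no cycle."""
    indegree = {node: 0 for node in nodes}
    successors = {node: [] for node in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1
    # Start from nodes with no incoming arcs and peel the graph layer by layer.
    queue = deque(node for node in nodes if indegree[node] == 0)
    visited = 0
    while queue:
        u = queue.popleft()
        visited += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Every node is visited exactly when the graph contains no cycle.
    return visited == len(nodes)


# Hypothetical toy example: a directed 3-cycle and one candidate order.
nodes = ["a", "b", "c"]
arcs = [("a", "b"), ("b", "c"), ("c", "a")]
order = ["a", "b", "c"]
kept = arcs_consistent_with_order(arcs, order)
```

Here the arc from `c` to `a` violates the candidate order and is dropped, and the remaining arcs form an acyclic graph. A search over orders, as in the iterative algorithms described above, would compare the objective value obtained under different candidate orders; that step is not shown here.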
