Learning an L1-Regularized Gaussian Bayesian Network in the Equivalence Class Space

Learning the structure of a graphical model from data is a common task in a wide range of practical applications. In this paper, we focus on Gaussian Bayesian networks, i.e., on continuous data and directed acyclic graphs in which the joint probability density of all variables is Gaussian. We propose to work in an equivalence class search space, specifically using the k-greedy equivalence search algorithm. Combined with regularization techniques that guide the structure search, this approach can learn sparse networks close to the one that generated the data. We provide results on synthetic networks and on modeling the gene network of the two biological pathways that regulate isoprenoid biosynthesis in the plant Arabidopsis thaliana.
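To illustrate how L1 regularization induces sparsity in a linear Gaussian model, the sketch below regresses one child variable on a set of candidate parents with a Lasso penalty, solved by coordinate descent. It is only an illustration of the regularization idea under stated assumptions: it is not the k-greedy equivalence search procedure, the objective scaling (1/(2n) times the squared error plus lam times the L1 norm) is one common convention rather than necessarily the one used in the paper, and the names `soft_threshold` and `lasso_parents` as well as the toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator that arises in L1-penalized least squares."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_parents(X, y, lam, n_iter=200):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X: list of n rows, each holding p candidate-parent values (assumed centered).
    y: list of n values of the child variable (assumed centered, no intercept).
    Returns the p regression weights; a zero weight means "no edge" in this sketch.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant-zero column carries no information
            # Covariance of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Toy data (hypothetical): the child depends only on the first candidate parent.
X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
y = [2.0, -2.0, 0.0, 0.0]
weights = lasso_parents(X, y, lam=0.1)
print(weights)
```

On this toy data the unpenalized least-squares weight for the first parent is 2; the penalty shrinks it below that value, while the weight of the irrelevant second parent stays exactly at zero, which is the sparsity effect the abstract relies on.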
