Order-independent constraint-based causal structure learning

We consider constraint-based methods for causal structure learning, such as the PC-, FCI-, RFCI- and CCD-algorithms (Spirtes et al., 1993, 2000; Richardson, 1996; Colombo et al., 2012; Claassen et al., 2013). The first step of all these algorithms consists of the adjacency search of the PC-algorithm. The PC-algorithm is known to be order-dependent, in the sense that its output can depend on the order in which the variables are given. This order-dependence is a minor issue in low-dimensional settings. We show, however, that it can be very pronounced in high-dimensional settings, where it can lead to highly variable results. We propose several modifications of the PC-algorithm (and hence also of the other algorithms) that remove part or all of this order-dependence. All proposed modifications are consistent in high-dimensional settings under the same conditions as their original counterparts. We compare the PC-, FCI-, and RFCI-algorithms and their modifications in simulation studies and on a yeast gene expression data set. We show that our modifications yield similar performance in low-dimensional settings and improved performance in high-dimensional settings. All software is implemented in the R-package pcalg.
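To make the order-dependence of the adjacency search concrete, the following is a minimal Python sketch, not the pcalg implementation. It assumes a hypothetical toy oracle `indep` in which three variables are pairwise independent given the third (the kind of inconsistent answers that sample-based tests can produce), and it assumes one possible remedy that the abstract does not spell out: freezing the adjacency sets at the start of each level so that edge removals cannot influence later tests at the same level. The names `skeleton`, `indep`, `FACTS`, and `stable` are illustrative only.

```python
from itertools import combinations

# Hypothetical test results: each pair is independent given the third variable.
FACTS = {
    (frozenset("AB"), frozenset("C")),
    (frozenset("AC"), frozenset("B")),
    (frozenset("BC"), frozenset("A")),
}

def indep(x, y, cond):
    """Toy conditional independence oracle backed by FACTS (an assumption)."""
    return (frozenset((x, y)), cond) in FACTS

def skeleton(nodes, indep, stable):
    """PC-style adjacency search; returns the remaining undirected edges."""
    adj = {v: set(nodes) - {v} for v in nodes}  # start from the complete graph
    level = 0
    # Continue while some adjacent pair still admits a conditioning set of this size.
    while any(len(adj[v]) - 1 >= level for v in nodes):
        # stable=True: search conditioning sets in a snapshot taken before any
        # removal at this level; stable=False: use the live adjacency sets.
        search = {v: set(adj[v]) for v in nodes} if stable else adj
        for x in nodes:
            for y in nodes:
                if x == y or y not in adj[x]:
                    continue
                candidates = sorted(search[x] - {y})
                for cond in combinations(candidates, level):
                    if indep(x, y, frozenset(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        break
        level += 1
    return {frozenset((x, y)) for x in nodes for y in adj[x]}

# Live adjacency sets: the surviving edge depends on the variable order.
print(skeleton(["A", "B", "C"], indep, stable=False))  # {frozenset({'B', 'C'})}
print(skeleton(["C", "B", "A"], indep, stable=False))  # {frozenset({'A', 'B'})}
# Frozen adjacency sets: both orders remove all three edges.
print(skeleton(["A", "B", "C"], indep, stable=True))   # set()
print(skeleton(["C", "B", "A"], indep, stable=True))   # set()
```

In the live variant, removing the first two edges empties the adjacency sets needed to test the third, so whichever pair is visited last survives. With the snapshot, every pair at a given level is tested against the same candidate sets, so the result no longer depends on the visiting order in this toy case.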

[1] D. Madigan, et al. A characterization of Markov equivalence classes for acyclic digraphs, 1997.

[2] A. Cano, et al. A Score Based Ranking of the Edges for the PC Algorithm, 2008.

[3] P. Spirtes, et al. Ancestral graph Markov models, 2002.

[4] Marek J. Druzdzel, et al. A Hybrid Anytime Algorithm for the Construction of Causal Models From Sparse Data, 1999, UAI.

[5] Thomas S. Richardson, et al. A Discovery Algorithm for Directed Cyclic Graphs, 1996, UAI.

[6] Thomas S. Richardson, et al. Learning high-dimensional directed acyclic graphs with latent and selection variables, 2011, 1104.5617.

[7] David Maxwell Chickering, et al. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data, 1994, Machine Learning.

[8] P. Spirtes, et al. MARKOV EQUIVALENCE FOR ANCESTRAL GRAPHS, 2009, 0908.3605.

[9] Moninder Singh, et al. An Algorithm for the Construction of Bayesian Network Structures from Data, 1993, UAI.

[10] David Maxwell Chickering, et al. Learning Equivalence Classes of Bayesian Network Structures, 1996, UAI.

[11] Naftali Harris, et al. PC algorithm for Gaussian copula graphical models, 2012, 1207.0242.

[12] Christopher Meek, et al. Causal inference and causal explanation with background knowledge, 1995, UAI.

[13] Peter Spirtes, et al. An Anytime Algorithm for Causal Inference, 2001, AISTATS.

[14] Peter Bühlmann, et al. Predicting causal effects in large-scale systems from observational data, 2010, Nature Methods.

[15] T. Heskes, et al. Learning Sparse Causal Models is not NP-hard, 2013, UAI.

[16] N. Meinshausen, et al. Stability selection, 2008, 0809.2932.

[17] Dirk Thierens, et al. A Skeleton-Based Approach to Learning Bayesian Networks from Data, 2003, PKDD.

[18] Peter Bühlmann, et al. Understanding human functioning using graphical models, 2010, BMC medical research methodology.

[19] Xing-Ming Zhao, et al. Inferring gene regulatory networks from gene expression data by PC-algorithm based on conditional mutual information, 2011.

[20] R. Nagarajan, et al. Functional Relationships between Genes Associated with Differentiation Potential of Aged Myogenic Progenitors, 2010, Front. Physiology.

[21] A. Dawid. Conditional Independence for Statistical Operations, 1980.

[22] Xing-Ming Zhao, et al. Inferring gene regulatory networks from gene expression data by path consistency algorithm based on conditional mutual information, 2012, Bioinform.

[23] Christopher Meek, et al. Learning Bayesian Networks with Discrete Variables from Data, 1995, KDD.

[24] Peter Bühlmann, et al. Causal Inference Using Graphical Models with the R Package pcalg, 2012.

[25] Tom Burr, et al. Causation, Prediction, and Search, 2003, Technometrics.

[26] J. Pearl. Causal inference in statistics: An overview, 2009.

[27] Gregory F. Cooper, et al. A Bayesian method for the induction of probabilistic networks from data, 1992, Machine Learning.

[28] Constantin F. Aliferis, et al. The max-min hill-climbing Bayesian network structure learning algorithm, 2006, Machine Learning.

[29] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[30] Jiji Zhang, et al. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias, 2008, Artif. Intell.

[31] Yudong D. He, et al. Functional Discovery via a Compendium of Expression Profiles, 2000, Cell.

[32] Jiji Zhang, et al. Adjacency-Faithfulness and Conservative Causal Inference, 2006, UAI.

[33] Peter Bühlmann, et al. Causal stability ranking, 2011, Bioinform.

[34] M. Maathuis, et al. Estimating high-dimensional intervention effects from observational data, 2008, 0810.4214.

[35] Peter Bühlmann, et al. Estimating High-Dimensional Directed Acyclic Graphs with the PC-Algorithm, 2007, J. Mach. Learn. Res.

[36] Thomas S. Richardson, et al. Causal Inference in the Presence of Latent Variables and Selection Bias, 1995, UAI.

[37] Naftali Harris, et al. PC algorithm for nonparanormal graphical models, 2013, J. Mach. Learn. Res.