Constraint-based Causal Structure Learning with Consistent Separating Sets

We consider constraint-based methods for causal structure learning, such as the PC algorithm or any PC-derived algorithm whose first step consists in pruning a complete graph to obtain an undirected graph skeleton, which is subsequently oriented. All constraint-based methods perform this first step of iteratively removing dispensable edges whenever a separating set and corresponding conditional independence can be found. Yet, constraint-based methods lack robustness against sampling noise and are prone to uncover spurious conditional independences in finite datasets. In particular, there is no guarantee that the separating sets identified during the iterative pruning step remain consistent with the final graph. In this paper, we propose a simple modification of PC and PC-derived algorithms to ensure that all separating sets identified to remove dispensable edges are consistent with the final graph, thus enhancing the explainability of constraint-based methods. This is achieved by repeating the constraint-based causal structure learning scheme iteratively, while searching for separating sets that are consistent with the graph obtained at the previous iteration. Ensuring the consistency of separating sets can be done at a limited complexity cost, through the use of block-cut tree decomposition of graph skeletons, and is found to increase their validity in terms of actual d-separation. It also significantly improves the sensitivity of constraint-based methods while retaining good overall structure learning performance. Finally, and most importantly, ensuring separating set consistency improves the interpretability of constraint-based models in real-life applications.
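The iterative scheme described above can be sketched in Python. This is a minimal, hypothetical illustration, not the paper's implementation: it uses a toy conditional-independence oracle in place of statistical CI tests, a naive path check instead of the block-cut tree decomposition, and a simple notion of consistency (every node of a separating set must lie on a path between the two endpoints in the previous graph). All names (`ci_test`, `learn_skeleton`, `consistent_pc`) are invented for this sketch.

```python
from itertools import combinations

# Toy CI oracle: in the assumed ground truth, X and Y are separated by {Z, W}.
# A real implementation would replace this with statistical independence tests.
TRUE_SEPSETS = {frozenset({"X", "Y"}): [{"Z", "W"}]}

def ci_test(x, y, s):
    """Return True if x is conditionally independent of y given s (toy oracle)."""
    return set(s) in TRUE_SEPSETS.get(frozenset({x, y}), [])

def on_path(graph, x, y, node):
    """True if `node` lies on some x-y path in `graph`, i.e. it is reachable
    from x without passing through y, and from y without passing through x."""
    def reachable(src, dst, banned):
        seen, stack = {src}, [src]
        while stack:
            u = stack.pop()
            if u == dst:
                return True
            for v in graph[u] - seen - {banned}:
                seen.add(v)
                stack.append(v)
        return False
    return reachable(x, node, y) and reachable(y, node, x)

def learn_skeleton(nodes, prev_graph=None):
    """One PC-style pruning pass. If prev_graph is given, only separating
    sets whose nodes all lie on an x-y path of prev_graph are accepted."""
    graph = {n: set(nodes) - {n} for n in nodes}
    sepsets = {}
    for x, y in combinations(nodes, 2):
        candidates = (graph[x] | graph[y]) - {x, y}
        for size in range(len(candidates) + 1):
            for s in combinations(sorted(candidates), size):
                if prev_graph is not None and not all(
                    on_path(prev_graph, x, y, n) for n in s
                ):
                    continue  # reject sepsets inconsistent with previous graph
                if ci_test(x, y, s):
                    graph[x].discard(y)
                    graph[y].discard(x)
                    sepsets[frozenset({x, y})] = set(s)
                    break
            else:
                continue  # no sepset of this size: try larger ones
            break  # edge removed: stop searching for this pair
    return graph, sepsets

def consistent_pc(nodes, max_iter=10):
    """Repeat skeleton learning, constraining each pass by the previous
    graph, until the separating sets stabilise."""
    graph, sepsets = learn_skeleton(nodes)
    for _ in range(max_iter):
        new_graph, new_sepsets = learn_skeleton(nodes, prev_graph=graph)
        if new_sepsets == sepsets:
            break
        graph, sepsets = new_graph, new_sepsets
    return graph, sepsets
```

On the toy example, the first pass removes the X-Y edge with sepset {Z, W}; the second pass verifies that both Z and W lie on X-Y paths of the learned skeleton, so the sepset is accepted as consistent and the procedure converges.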
