Permutation-based Causal Inference Algorithms with Interventions

Learning directed acyclic graphs from both observational and interventional data has become a fundamentally important problem, because recent technological developments in genomics generate single-cell gene expression data of both kinds at a very large scale. To use these data for learning gene regulatory networks, efficient and reliable causal inference algorithms are needed that can exploit both observational and interventional data. In this paper, we present two such algorithms and prove that both are consistent under the faithfulness assumption. These algorithms are interventional adaptations of the Greedy SP algorithm and are the first algorithms using both observational and interventional data with consistency guarantees. Moreover, they are nonparametric, which makes them useful also for analyzing non-Gaussian data. We analyze their performance on simulated data, protein signaling data, and single-cell gene expression data.
