Joint Causal Inference from Multiple Contexts

The gold standard for discovering causal relations is experimentation. Over the last decades, alternative methods have been proposed that infer causal relations between variables from certain statistical patterns in purely observational data. We introduce Joint Causal Inference (JCI), a novel approach to causal discovery from multiple data sets from different contexts that elegantly unifies both approaches. JCI is a causal modeling framework rather than a specific algorithm, and it can be implemented using any causal discovery algorithm that can take certain background knowledge into account. JCI handles different types of interventions (e.g., perfect, imperfect, and stochastic) in a unified fashion, and it does not require knowledge of intervention targets or types when the data are interventional. We explain how several well-known causal discovery algorithms can be seen as addressing special cases of the JCI framework, and we propose novel implementations that extend existing causal discovery methods for purely observational data to the JCI setting. We evaluate different JCI implementations on synthetic data and on flow cytometry protein expression data and conclude that JCI implementations can considerably outperform state-of-the-art causal discovery algorithms.
