Causal Effect Identification from Multiple Incomplete Data Sources: A General Search-based Approach

Causal effect identification considers whether an interventional probability distribution can be uniquely determined, without parametric assumptions, from measured source distributions and structural knowledge about the generating system. While complete graphical criteria and procedures exist for many identification problems, there are still challenging but important extensions that have not been considered in the literature. To tackle these new settings, we present a search algorithm that operates directly over the rules of do-calculus. Owing to the generality of do-calculus, the search can take more advanced data-generating mechanisms into account, along with arbitrary combinations of observational and experimental source distributions. The search is enhanced with a heuristic and with search space reduction techniques. The approach, called do-search, is provably sound, and it is complete with respect to identifiability problems that have been shown to be completely characterized by do-calculus. When extended with additional rules, the search can handle missing data problems as well. With this versatile search, we are able to approach new problems such as combined transportability and selection bias, or multiple sources of selection bias. We also perform a systematic analysis of bivariate missing data problems and study causal inference under a case-control design.
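
For orientation, the three rules of do-calculus that the search operates over can be stated as follows. This is the standard textbook formulation rather than notation taken from the paper itself; the symbols G, X, Y, Z and W are assumed here, with G a causal graph and X, Y, Z, W disjoint sets of its vertices.

```latex
% Notation (assumed): G_{\overline{X}} is G with all edges into X removed,
% G_{\underline{Z}} is G with all edges out of Z removed.

% Rule 1 (insertion/deletion of observations)
P(y \mid \mathrm{do}(x), z, w) = P(y \mid \mathrm{do}(x), w)
\quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}}

% Rule 2 (exchange of actions and observations)
P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), z, w)
\quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\underline{Z}}

% Rule 3 (insertion/deletion of actions)
P(y \mid \mathrm{do}(x), \mathrm{do}(z), w) = P(y \mid \mathrm{do}(x), w)
\quad \text{if } (Y \perp\!\!\!\perp Z \mid X, W) \text{ in } G_{\overline{X}\,\overline{Z(W)}}

% Z(W) is the set of vertices in Z that are not ancestors of any vertex
% in W in the graph G_{\overline{X}}.
```

Each rule licenses rewriting one distribution into another whenever the stated d-separation condition holds in the modified graph, which is what makes a rule-by-rule search from the source distributions toward the target distribution possible.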
