Efficient Intervention Design for Causal Discovery with Latents

We consider recovering a causal graph in the presence of latent variables, where we seek to minimize the cost of the interventions used in the recovery process. We study two intervention cost models: (1) a linear cost model, where the cost of an intervention on a subset of variables is a linear function of that subset, and (2) an identity cost model, where every intervention has the same cost regardless of which variables it targets, so the goal is simply to minimize the number of interventions. Under the linear cost model, we give an algorithm that identifies the ancestral relations of the underlying causal graph at an intervention cost within a factor of $2$ of optimal. This approximation factor can be improved to $1+\epsilon$ for any $\epsilon > 0$ under some mild restrictions. Under the identity cost model, we bound the number of interventions needed to recover the entire causal graph, including the latent variables, using a parameterization of the causal graph through a special type of collider. In particular, we introduce the notion of $p$-colliders, colliders between pairs of nodes that arise from a specific type of conditioning in the causal graph, and we give an upper bound on the number of interventions as a function of the maximum number of $p$-colliders between any two nodes in the causal graph.
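As a concrete reading of the two cost models (a sketch only, under the assumption that the linear cost is additive; the per-variable costs $c_v$ and intervention sets $S_1, \dots, S_m$ are illustrative notation rather than the paper's), a sequence of interventions on subsets $S_1, \dots, S_m$ of the variables would incur total cost

$$\sum_{i=1}^{m} \sum_{v \in S_i} c_v \quad \text{(linear cost model)}, \qquad\qquad \sum_{i=1}^{m} 1 = m \quad \text{(identity cost model)},$$

so under the identity model, minimizing the intervention cost coincides with minimizing the number of interventions $m$.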
