Scalable Intervention Target Estimation in Linear Models

This paper considers the problem of estimating unknown intervention targets in a causal directed acyclic graph from observational and interventional data. The focus is on soft interventions in linear structural equation models (SEMs). Current approaches to causal structure learning either assume known intervention targets or, even for linear SEMs, rely on hypothesis testing to discover the unknown targets, which severely limits their scalability and worsens their sample complexity. This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets. The pivotal idea is to estimate the intervention sites from the difference between the precision matrices associated with the observational and interventional datasets. The algorithm repeatedly estimates such sites over different subsets of variables. It can also be used to update a given observational Markov equivalence class into the interventional Markov equivalence class. Consistency, Markov equivalence, and sample complexity are established analytically. Finally, simulation results on both real and synthetic data demonstrate the gains of the proposed approach for scalable causal structure recovery. An implementation of the algorithm and the code to reproduce the simulation results are available at https://github.com/bvarici/intervention-estimation.
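
The precision-difference idea can be illustrated with a small population-level sketch. It is not the paper's algorithm or its released code: the function names, the four-node chain graph, the edge weights, and the noise variances below are all assumptions chosen for illustration, and exact precision matrices are computed from the model instead of being estimated from samples. For a linear SEM X = Bᵀ X + ε with noise covariance Ω, the precision matrix is (I − B) Ω⁻¹ (I − B)ᵀ, so a soft intervention on one node changes only a small block of it.

```python
import numpy as np


def precision_from_sem(B, noise_var):
    """Exact precision matrix of X = B.T @ X + eps, eps ~ N(0, diag(noise_var)).

    B[i, j] is the weight of edge i -> j (hypothetical convention for this sketch).
    """
    p = B.shape[0]
    I = np.eye(p)
    omega_inv = np.diag(1.0 / np.asarray(noise_var, dtype=float))
    return (I - B) @ omega_inv @ (I - B).T


def candidate_sites(theta_obs, theta_int, tol=1e-8):
    """Indices whose row in the precision difference is not numerically zero."""
    delta = theta_obs - theta_int
    return [i for i in range(delta.shape[0]) if np.max(np.abs(delta[i])) > tol]


# Observational model: chain 0 -> 1 -> 2 -> 3 with unit noise variances (assumed values).
B_obs = np.zeros((4, 4))
B_obs[0, 1], B_obs[1, 2], B_obs[2, 3] = 0.8, -0.5, 0.7
var_obs = np.ones(4)

# Soft intervention on node 2: its incoming weight and noise variance change,
# while the graph itself stays the same.
B_int = B_obs.copy()
B_int[1, 2] = 0.1
var_int = var_obs.copy()
var_int[2] = 2.0

sites = candidate_sites(
    precision_from_sem(B_obs, var_obs),
    precision_from_sem(B_int, var_int),
)
print(sites)  # [1, 2]
```

In this toy example the support of the difference contains the intervened node 2 together with its parent, node 1, so a single difference yields a candidate set and not the exact targets. With finite data, the two precision matrices (or their difference) would have to be estimated, which this sketch does not attempt.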
