Causal Dantzig: Fast inference in linear structural equation models with hidden variables under additive interventions

Causal inference is known to be very challenging when only observational data are available. Randomized experiments are often costly or impractical, and in instrumental variable regression the number of instruments has to exceed the number of causal predictors. Peters et al. [2016] recently showed that causal inference for the full model is possible when data from distinct observational environments are available, by exploiting the fact that the conditional distribution of the response variable is invariant under the correct causal model. Two shortcomings of that approach are its high computational cost on large-scale data and the assumed absence of hidden confounders. Here we show that both shortcomings can be addressed if one is willing to make a more restrictive assumption on the type of interventions that generate the different environments. To this end, we consider a different notion of invariance, namely inner-product invariance. Because it avoids a computationally cumbersome reverse-engineering approach such as the one in Peters et al. [2016], the resulting method, causal Dantzig, allows for large-scale causal inference in linear structural equation models. We discuss identifiability conditions for the causal parameter and derive asymptotic confidence intervals in the low-dimensional setting. In the case of non-identifiability, we show that the solution set of causal Dantzig has predictive guarantees under certain interventions. For the high-dimensional setting, we derive finite-sample bounds, and we investigate the performance of the method on simulated datasets.
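The idea of inner-product invariance can be illustrated with a small simulation. The sketch below is not taken from the abstract. It assumes one reading of the invariance, namely that the inner product E[X (Y - X^T beta)] is the same in every environment at the causal beta, and it uses only the unregularized, two-environment form of the estimator, obtained by solving (G2 - G1) beta = (Z2 - Z1) with G = X^T X / n and Z = X^T y / n. The data-generating model, the coefficients `BETA_TRUE`, and all function names are hypothetical choices made for illustration.

```python
import random

BETA_TRUE = [1.0, -0.5]  # hypothetical causal coefficients for this toy model


def simulate(n, shift_sd, rng):
    """Draw n samples from a toy linear SEM with a hidden confounder h.

    Both predictors receive an additive shift with standard deviation
    shift_sd; the response y is never intervened on.
    """
    xs, ys = [], []
    for _ in range(n):
        h = rng.gauss(0.0, 1.0)  # hidden confounder, never observed
        x1 = h + rng.gauss(0.0, 1.0) + shift_sd * rng.gauss(0.0, 1.0)
        x2 = h + rng.gauss(0.0, 1.0) + shift_sd * rng.gauss(0.0, 1.0)
        y = BETA_TRUE[0] * x1 + BETA_TRUE[1] * x2 + 2.0 * h + rng.gauss(0.0, 1.0)
        xs.append((x1, x2))
        ys.append(y)
    return xs, ys


def moments(xs, ys):
    """Return the uncentered Gram matrix X^T X / n and the vector X^T y / n."""
    n, p = len(xs), len(xs[0])
    gram = [[sum(x[j] * x[k] for x in xs) / n for k in range(p)] for j in range(p)]
    xty = [sum(x[j] * y for x, y in zip(xs, ys)) / n for j in range(p)]
    return gram, xty


def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ beta = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def difference_estimator(env1, env2):
    """Unregularized two-environment estimate from differences of moments."""
    g1, z1 = moments(*env1)
    g2, z2 = moments(*env2)
    g_diff = [[g2[j][k] - g1[j][k] for k in range(2)] for j in range(2)]
    z_diff = [z2[j] - z1[j] for j in range(2)]
    return solve_2x2(g_diff, z_diff)


rng = random.Random(0)
env_obs = simulate(20000, 0.0, rng)    # observational environment, no shift
env_shift = simulate(20000, 2.0, rng)  # environment with additive shifts

beta_hat = difference_estimator(env_obs, env_shift)
beta_ols = solve_2x2(*moments(*env_obs))  # least squares on observational data

print("difference-of-moments estimate:", beta_hat)
print("least squares on env_obs:      ", beta_ols)
```

In this toy model the hidden confounder biases ordinary least squares, while the difference of moments cancels the confounding term because the shifts are independent of the confounder and of the noise in the response. With a single predictor environment or no difference between environments, the matrix G2 - G1 is singular and the sketch fails, which corresponds loosely to the non-identifiable case mentioned in the abstract.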

[1] Terence Tao, et al. The Dantzig selector: Statistical estimation when p is much larger than n, 2005, math/0506081.

[2] A. Wald. The Fitting of Straight Lines if Both Variables are Subject to Error, 1940.

[3] Peter Dalgaard, et al. R Development Core Team (2010): R: A language and environment for statistical computing, 2010.

[4] David Maxwell Chickering, et al. Optimal Structure Identification With Greedy Search, 2002, J. Mach. Learn. Res.

[5] Bernhard Schölkopf, et al. Nonlinear causal discovery with additive noise models, 2008, NIPS.

[6] A. Dawid. Influence Diagrams for Causal Modelling and Inference, 2002.

[7] Jin Tian, et al. Causal Discovery from Changes, 2001, UAI.

[8] Jonas Peters, et al. BACKSHIFT: Learning causal cyclic graphs from unknown shift interventions, 2015, NIPS.

[9] Alain Hauser, et al. Jointly interventional and observational data: estimation of interventional Markov equivalence classes of directed acyclic graphs, 2013, 1303.3216.

[10] Peter Bühlmann, et al. Characterization and Greedy Learning of Interventional Markov Equivalence Classes of Directed Acyclic Graphs (Abstract), 2011, UAI.

[11] N. Meinshausen, et al. Methods for causal inference from gene perturbation experiments and validation, 2016, Proceedings of the National Academy of Sciences.

[12] A. Lewbel, et al. Using Heteroscedasticity to Identify and Estimate Mismeasured and Endogenous Regressor Models, 2012.

[13] Cun-Hui Zhang, et al. Rate Minimaxity of the Lasso and Dantzig Selector for the ℓq Loss in ℓr Balls, 2010, J. Mach. Learn. Res.

[14] Sara van de Geer, et al. Statistics for High-Dimensional Data: Methods, Theory and Applications, 2011.

[15] Jonas Peters, et al. Causal inference by using invariant prediction: identification and confidence intervals, 2015, 1501.01332.

[16] T. Richardson. Single World Intervention Graphs (SWIGs): A Unification of the Counterfactual and Graphical Approaches to Causality, 2013.

[17] J. Robins, et al. Signed directed acyclic graphs for causal inference, 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[18] J. Pearl. Causality: Models, Reasoning and Inference, 2000.

[19] Judea Pearl, et al. Equivalence and Synthesis of Causal Models, 1990, UAI.

[20] Philip G. Wright, et al. The tariff on animal and vegetable oils, 1928.

[21] J. Robins, et al. Marginal Structural Models and Causal Inference in Epidemiology, 2000, Epidemiology.

[22] Eric Tchetgen Tchetgen, et al. Bounded, efficient and multiply robust estimation of average treatment effects using instrumental variables, 2016, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[23] M. Maathuis, et al. Estimating high-dimensional intervention effects from observational data, 2008, 0810.4214.

[24] Joshua D. Angrist, et al. Identification of Causal Effects Using Instrumental Variables, 1993.

[25] T. W. Anderson. Asymptotically Efficient Estimation of Covariance Matrices with Linear Structure, 1973.

[26] Mehdi M. Kashani, et al. Large-Scale Genetic Perturbations Reveal Regulatory Networks and an Abundance of Gene-Specific Repressors, 2014, Cell.

[27] D. Rubin. Estimating causal effects of treatments in randomized and nonrandomized studies, 1974.

[28] D. Madigan, et al. A characterization of Markov equivalence classes for acyclic digraphs, 1997.

[29] S. Geer, et al. On the conditions used to prove oracle results for the Lasso, 2009, 0910.0722.

[30] Vanessa Didelez, et al. Assumptions of IV methods for observational epidemiology, 2010, 1011.0595.

[31] Andreas Ritter, et al. Structural Equations With Latent Variables, 2016.

[32] Aapo Hyvärinen, et al. A Linear Non-Gaussian Acyclic Model for Causal Discovery, 2006, J. Mach. Learn. Res.