Causal Discovery Using Regression-Based Conditional Independence Tests

Conditional independence (CI) testing is an important tool in causal discovery. In general, CI tests are used to estimate a set of Markov equivalence classes with respect to the observed data by checking whether each pair of variables x and y is d-separated given a set of variables Z. Because of the curse of dimensionality, however, CI tests often fail to return reliable results when Z is high-dimensional. In this paper, we propose a regression-based CI test that relaxes the test of x ⊥ y | Z to simpler unconditional independence tests of x − f(Z) ⊥ y − g(Z), and x − f(Z) ⊥ Z or y − g(Z) ⊥ Z, under the assumption that the data-generating procedure follows additive noise models (ANMs). When the ANM is identifiable, we prove that x − f(Z) ⊥ y − g(Z) ⇒ x ⊥ y | Z. We also show that 1) f and g can be easily estimated by regression, 2) our test is more powerful than the state-of-the-art kernel CI tests, and 3) existing causal learning algorithms can infer many more causal directions when they use the proposed method.
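The reduction described above can be illustrated with a minimal sketch. It is not the paper's implementation: the polynomial least-squares fit stands in for whatever regressor is used to estimate f and g, the unconditional test is a permutation-based HSIC with an RBF kernel, and the names `rbf_gram`, `hsic_pvalue`, `residual`, `regression_ci_test`, and `looks_ci`, together with the decision rule at level `alpha`, are assumptions made for illustration only.

```python
import numpy as np

def rbf_gram(v):
    """RBF Gram matrix with a median-heuristic bandwidth (an assumption)."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    sq = np.sum(v ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * v @ v.T, 0.0)
    return np.exp(-d2 / np.median(d2[d2 > 0]))

def hsic_pvalue(a, b, n_perm=200, seed=0):
    """Permutation p-value for the unconditional test a ⊥ b using HSIC."""
    rng = np.random.default_rng(seed)
    K, L = rbf_gram(a), rbf_gram(b)
    n = K.shape[0]
    H = np.eye(n) - 1.0 / n            # centering matrix I - (1/n) 11^T
    Kc = H @ K @ H
    stat = np.sum(Kc * L) / n ** 2     # equals trace(K H L H) / n^2
    null = np.empty(n_perm)
    for i in range(n_perm):
        p = rng.permutation(n)         # break the pairing between a and b
        null[i] = np.sum(Kc * L[np.ix_(p, p)]) / n ** 2
    return (1 + np.sum(null >= stat)) / (1 + n_perm)

def residual(target, Z, degree=5):
    """target - fhat(Z), with fhat an additive polynomial least-squares fit."""
    Z = np.asarray(Z, dtype=float).reshape(len(target), -1)
    cols = [np.ones(len(target))]
    cols += [Z[:, j] ** d for j in range(Z.shape[1]) for d in range(1, degree + 1)]
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return target - X @ coef

def regression_ci_test(x, y, Z, degree=5, seed=0):
    """p-values of the three unconditional tests that replace x ⊥ y | Z."""
    rx, ry = residual(x, Z, degree), residual(y, Z, degree)
    return {
        "residuals": hsic_pvalue(rx, ry, seed=seed),  # x - f(Z) ⊥ y - g(Z)
        "x_vs_Z": hsic_pvalue(rx, Z, seed=seed),      # x - f(Z) ⊥ Z
        "y_vs_Z": hsic_pvalue(ry, Z, seed=seed),      # y - g(Z) ⊥ Z
    }

def looks_ci(p, alpha=0.05):
    """Hypothetical decision rule: accept x ⊥ y | Z if no test rejects."""
    return p["residuals"] > alpha and max(p["x_vs_Z"], p["y_vs_Z"]) > alpha

# Synthetic additive-noise data (made up for this example).
rng = np.random.default_rng(0)
n = 200
z = rng.uniform(-2.0, 2.0, size=n)
e_x, e_y = 0.3 * rng.standard_normal(n), 0.3 * rng.standard_normal(n)
x = np.sin(z) + e_x
p_ci = regression_ci_test(x, z ** 2 + e_y, z)       # x ⊥ y | Z holds
p_dep = regression_ci_test(x, x + z ** 2 + e_y, z)  # x directly affects y
print(p_ci, looks_ci(p_ci))
print(p_dep, looks_ci(p_dep))
```

In the second data set the residuals of x and y share the noise term of x, so the residual test should reject independence, while in the first it should typically not. Each unconditional test involves only the residuals or a single residual and Z, which is the point of the reduction when Z is high-dimensional.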
