The Randomized Causation Coefficient

We are interested in learning causal relationships between pairs of random variables, purely from observational data. To effectively address this task, the state-of-the-art relies on strong assumptions on the mechanisms mapping causes to effects, such as invertibility or the existence of additive noise, which only hold in limited situations. On the contrary, this short paper proposes to learn how to perform causal inference directly from data, without the need of feature engineering. In particular, we pose causality as a kernel mean embedding classification problem, where inputs are samples from arbitrary probability distributions on pairs of random variables, and labels are types of causal relationships. We validate the performance of our method on synthetic and real-world data against the state-of-the-art. Moreover, we submitted our algorithm to the ChaLearn's "Fast Causation Coefficient Challenge" competition, with which we won the fastest code prize and ranked third in the overall leaderboard.

[1]  W. Rudin,et al.  Fourier Analysis on Groups. , 1965 .

[2]  O. Penrose The Direction of Time , 1962 .

[3]  D. Braddon-Mitchell NATURE'S CAPACITIES AND THEIR MEASUREMENT , 1991 .

[4]  A. Berlinet,et al.  Reproducing kernel Hilbert spaces in probability and statistics , 2004 .

[5]  Le Song,et al.  A Hilbert Space Embedding for Distributions , 2007, Discovery Science.

[6]  Benjamin Recht,et al.  Random Features for Large-Scale Kernel Machines , 2007, NIPS.

[7]  Bernhard Schölkopf,et al.  Nonlinear causal discovery with additive noise models , 2008, NIPS.

[8]  Aapo Hyvärinen,et al.  On the Identifiability of the Post-Nonlinear Causal Model , 2009, UAI.

[9]  Bernhard Schölkopf,et al.  Inferring deterministic causal relations , 2010, UAI.

[10]  Bernhard Schölkopf,et al.  Probabilistic latent variable models for distinguishing between cause and effect , 2010, NIPS.

[11]  Bernhard Schölkopf,et al.  Hilbert Space Embeddings and Metrics on Probability Measures , 2009, J. Mach. Learn. Res..

[12]  Bernhard Schölkopf,et al.  Information-geometric approach to inferring causal directions , 2012, Artif. Intell..

[13]  Bernhard Schölkopf,et al.  Learning from Distributions via Support Measure Machines , 2012, NIPS.

[14]  Bernhard Schölkopf,et al.  On causal and anticausal learning , 2012, ICML.

[15]  Bernhard Schölkopf,et al.  The Randomized Dependence Coefficient , 2013, NIPS.

[16]  Bernhard Schölkopf,et al.  Causal discovery with continuous additive noise models , 2013, J. Mach. Learn. Res..

[17]  Bernhard Schölkopf,et al.  Kernel Mean Estimation and Stein Effect , 2013, ICML.

[18]  Bernhard Schölkopf,et al.  Consistency of Causal Inference under the Additive Noise Model , 2013, ICML.

[19]  David Lopez-Paz,et al.  Towards a Learning Theory of Causation , 2015 .

[20]  Barnabás Póczos,et al.  Two-stage sampled learning theory on distributions , 2015, AISTATS.

[21]  B. Schölkopf,et al.  Justifying Information-Geometric Causal Inference , 2014, 1402.2499.

[22]  Bernhard Schölkopf,et al.  Distinguishing Cause from Effect Using Observational Data: Methods and Benchmarks , 2014, J. Mach. Learn. Res..