Causal Inference on Multidimensional Data Using Free Probability Theory

In this paper, we address the problem of inferring causal relations in multidimensional data. Based on the postulate that the distribution of the cause and the conditional distribution of the effect given the cause are generated independently, we show that the covariance matrix of the mean embedding of the cause in a reproducing kernel Hilbert space (RKHS) is freely independent of the covariance matrix of the conditional embedding of the effect given the cause. This property, which we call the freeness condition, induces a cause–effect asymmetry: a designed measurement is 0 in the causal direction but smaller than 0 in the anticausal direction, which uncovers the causal direction. An important novel aspect of this paper is that we interpret the independence as a freeness condition between covariance matrices of RKHS distribution embeddings, an interpretation with wide applicability. Through theoretical analysis and experimental results, we show that our freeness condition-based inference method succeeds in scenarios, such as additive noise cases, where other methods fail.
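The asymmetry described above can be illustrated in a much simpler setting than the one the paper treats. The sketch below is not the paper's kernel-based measurement. It assumes a noiseless linear model Y = AX in finite dimensions and uses the normalized-trace quantity log τ(AΣAᵀ) − log τ(AAᵀ) − log τ(Σ), which is close to 0 when the map and the input covariance are chosen independently of each other. All names (`trace_delta`, `random_orthogonal`, the dimension, and the spectra) are hypothetical choices made for illustration.

```python
import numpy as np

def normalized_trace(m):
    # tau(M) = trace(M) / dimension
    return np.trace(m) / m.shape[0]

def trace_delta(a, sigma):
    # log tau(A S A^T) - log tau(A A^T) - log tau(S)
    return (np.log(normalized_trace(a @ sigma @ a.T))
            - np.log(normalized_trace(a @ a.T))
            - np.log(normalized_trace(sigma)))

def random_orthogonal(n, rng):
    # Orthogonal factor from the QR decomposition of a Gaussian matrix
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q

rng = np.random.default_rng(0)
n = 200  # assumed dimension; larger n gives a forward value closer to 0

# Covariance of the cause, with an arbitrary spread of eigenvalues
sigma_x = np.diag(rng.uniform(0.5, 2.0, size=n))

# Mechanism A drawn independently of sigma_x, with randomly oriented
# singular vectors and singular values in [0.5, 2]
u, v = random_orthogonal(n, rng), random_orthogonal(n, rng)
a = u @ np.diag(rng.uniform(0.5, 2.0, size=n)) @ v.T

# Covariance of the effect under Y = A X
sigma_y = a @ sigma_x @ a.T

delta_forward = trace_delta(a, sigma_x)                   # causal direction
delta_backward = trace_delta(np.linalg.inv(a), sigma_y)   # anticausal direction

print("forward:", delta_forward)
print("backward:", delta_backward)
```

In this toy setting the forward value is near 0 while the backward value is clearly negative, which mirrors the sign pattern stated in the abstract. The kernel embeddings, the conditional embedding operator, and the treatment of noise are left out entirely.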
