Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows

One major drawback of state-of-the-art artificial intelligence is its lack of explainability. One approach to solving this problem is to take causality into account. Causal mechanisms can be described by structural causal models. In this work, we propose a method for estimating bivariate structural causal models that combines normalising flows, applied to density estimation, with variational Gaussian process regression for post-nonlinear models. The method facilitates causal discovery, i.e. distinguishing cause from effect, either through the independence of cause and residual or through a likelihood ratio test. Because it estimates post-nonlinear models, our method can explain a variety of real-world cause-effect pairs better than a simple additive noise model. Although it remains difficult to exploit this benefit across all pairs from the Tübingen benchmark database, we demonstrate that combining the additive noise model approach with our method significantly enhances causal discovery.
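To make the "independence of cause and residual" criterion concrete, the following is a minimal sketch of the simple additive noise baseline mentioned in the abstract, not of the proposed post-nonlinear method. It makes several assumptions that do not come from the paper: a cubic polynomial fit stands in for Gaussian process regression, dependence is scored with a biased HSIC statistic using Gaussian kernels and a median-heuristic bandwidth, and the function names (`hsic`, `anm_residual`, `infer_direction`) are hypothetical. The direction whose residual appears less dependent on the putative cause is preferred.

```python
import numpy as np


def _gram(v):
    """Gaussian-kernel Gram matrix; bandwidth set by the median heuristic (assumption)."""
    d2 = (v[:, None] - v[None, :]) ** 2
    sigma2 = np.median(d2[d2 > 0])
    return np.exp(-d2 / (2.0 * sigma2))


def hsic(a, b):
    """Biased HSIC statistic: larger values indicate stronger dependence."""
    n = len(a)
    H = np.eye(n) - np.ones((n, n)) / n  # centring matrix
    return np.trace(_gram(a) @ H @ _gram(b) @ H) / n**2


def anm_residual(cause, effect, degree=3):
    """Residual of a polynomial fit effect ~ f(cause).

    The polynomial is a stand-in for the GP regression used in practice.
    """
    coeffs = np.polyfit(cause, effect, degree)
    return effect - np.polyval(coeffs, cause)


def infer_direction(x, y):
    """Prefer the direction whose residual looks more independent of the cause."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    score_xy = hsic(x, anm_residual(x, y))  # hypothesis: x causes y
    score_yx = hsic(y, anm_residual(y, x))  # hypothesis: y causes x
    direction = "x->y" if score_xy < score_yx else "y->x"
    return direction, score_xy, score_yx


# Synthetic pair with known ground truth x -> y (illustrative data only).
rng = np.random.default_rng(0)
x_demo = rng.uniform(-2, 2, 500)
y_demo = x_demo + x_demo**3 + rng.uniform(-1, 1, 500)
print(infer_direction(x_demo, y_demo))
```

A post-nonlinear model would additionally pass the sum of function and noise through an invertible outer nonlinearity, which this baseline cannot represent. A proper decision rule would also replace the raw score comparison with a calibrated independence test or, as the abstract notes, a likelihood ratio test.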
