Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data

Standard causal discovery methods must fit a new model whenever they encounter samples from a new underlying causal graph. However, these samples often share relevant information, such as the dynamics describing the effects of causal relations, and this information is lost when a separate model is fit to each sample. We propose Amortized Causal Discovery, a novel framework that leverages such shared dynamics to learn to infer causal relations from time-series data. This enables us to train a single, amortized model that infers causal relations across samples with different underlying causal graphs, and thus makes use of the shared information. We demonstrate experimentally that this approach, implemented as a variational model, leads to significant improvements in causal discovery performance, and show how it can be extended to perform well under hidden confounding.
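
One way to make the amortized setup concrete is the sketch below. It is an illustration under stated assumptions, not the formulation given in the paper: the symbols $x_s$, $G_s$, $f_\theta$, $q_\phi$, the prior $p(G)$, and the noise term $\varepsilon$ are all introduced here for illustration, and the objective is the standard evidence lower bound that a variational model of this kind would typically optimize.

```latex
% Assumed notation (not from the source):
%   x_s = (x_s^1, ..., x_s^T)  time series of sample s, for s = 1, ..., S
%   G_s                        causal graph underlying sample s (varies with s)
%   f_\theta                   dynamics model shared across all samples
%   q_\phi                     amortized encoder shared across all samples

% Shared dynamics: only the graph changes from sample to sample.
x_s^{t+1} = f_\theta\left(x_s^{\le t},\, G_s\right) + \varepsilon_s^{t}

% Amortized inference: one encoder maps any sample to a distribution over graphs.
G_s \sim q_\phi\left(G \mid x_s\right)

% A standard variational objective, summed over samples with different graphs:
\max_{\theta,\phi} \; \sum_{s=1}^{S} \left(
  \mathbb{E}_{q_\phi(G \mid x_s)}\left[
    \sum_{t=1}^{T-1} \log p_\theta\left(x_s^{t+1} \mid x_s^{\le t},\, G\right)
  \right]
  - \mathrm{KL}\left[\, q_\phi\left(G \mid x_s\right) \,\|\, p(G) \,\right]
\right)
```

Under these assumptions, the parameters $\theta$ and $\phi$ are fit once on all samples, so inferring the graph of a new sample requires only a forward pass through $q_\phi$ rather than fitting a new model.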
