Learning Neural Causal Models from Unknown Interventions

Promising results have driven a recent surge of interest in continuous optimization methods for Bayesian network structure learning from observational data. However, there are theoretical limits on how well the underlying structure can be identified from observational data alone. Interventional data provides much richer information about the underlying data-generating process, but extending methods designed for observational data to include interventions is not straightforward and remains an open problem. In this paper we provide a general framework based on continuous optimization and neural networks for building models from a combination of observational and interventional data. The proposed method is applicable even in the challenging and realistic case where the identity of the intervened-upon variable is unknown. We examine the proposed method in the setting of graph recovery, both de novo and from a partially known edge set. We establish strong benchmark results on several structure learning tasks, including structure recovery of both synthetic graphs and standard graphs from the Bayesian Network Repository.
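The identifiability limit on observational data can be made concrete with a two-variable toy case. The sketch below is not the method proposed in the paper; it is a hand-built illustration, assuming two hypothetical linear Gaussian models (the function name `second_moments`, the coefficients, and the noise variances are all chosen for this example). The models X -> Y and Y -> X are parameterized so that their observational covariances coincide, yet they disagree once X is set by an intervention.

```python
def second_moments(direction, intervene_on_x=False, intervention_var=4.0):
    """Return (var_x, var_y, cov_xy) for a zero-mean, two-variable linear
    Gaussian model, computed analytically.

    Illustrative assumptions (not from the paper):
      "x->y":  X ~ N(0, 1),  Y = X + N(0, 1)
      "y->x":  Y ~ N(0, 2),  X = 0.5 * Y + N(0, 0.5)
    An intervention on X replaces its mechanism with X ~ N(0, intervention_var).
    """
    if direction == "x->y":
        # X is a root node, so the intervention only changes its variance.
        var_x = intervention_var if intervene_on_x else 1.0
        var_y = var_x + 1.0   # Y = X + unit-variance noise
        cov_xy = var_x        # Cov(X, X + noise) = Var(X)
    elif direction == "y->x":
        var_y = 2.0           # Y is a root node and is never intervened on here
        if intervene_on_x:
            # The intervention cuts the edge Y -> X, so X and Y are independent.
            var_x = intervention_var
            cov_xy = 0.0
        else:
            var_x = 0.25 * var_y + 0.5   # Var(0.5 * Y) + noise variance
            cov_xy = 0.5 * var_y         # Cov(0.5 * Y + noise, Y)
    else:
        raise ValueError("direction must be 'x->y' or 'y->x'")
    return var_x, var_y, cov_xy


# Observational regime: both graphs give the same Gaussian distribution.
obs_forward = second_moments("x->y")
obs_backward = second_moments("y->x")

# Interventional regime (X set externally): the two graphs now differ.
int_forward = second_moments("x->y", intervene_on_x=True)
int_backward = second_moments("y->x", intervene_on_x=True)
```

Because a zero-mean Gaussian is fully determined by its covariance, `obs_forward` and `obs_backward` being equal means no amount of observational data separates the two graphs in this toy case, while the covariance between X and Y under the intervention does.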
