Learning linear cyclic causal models with latent variables

Identifying cause-effect relationships between variables of interest is a central problem in science. We describe a procedure that, given a set of experiments, identifies linear models that may contain cycles and latent variables. We provide a detailed description of the model family, full proofs of the necessary and sufficient conditions for identifiability, a search algorithm that is complete, and a discussion of what can be done when the identifiability conditions are not satisfied. We test the algorithm comprehensively in simulations and compare it to competing algorithms in the literature. Furthermore, we adapt the procedure to the problem of cellular network inference, applying it to the biologically realistic data of the DREAM challenges. The paper provides a full theoretical foundation for the causal discovery procedure first presented by Eberhardt et al. (2010) and Hyttinen et al. (2010).

[1]  Joel Spencer, et al. Minimal completely separating systems, 1970.

[2]  Mark W. Schmidt, et al. Modeling Discrete Interventional Data using Directed Cyclic Graphical Models, 2009, UAI.

[3]  Gregory F. Cooper, et al. Causal Discovery from a Mixture of Experimental and Observational Data, 1999, UAI.

[4]  Aapo Hyvärinen, et al. A Linear Non-Gaussian Acyclic Model for Causal Discovery, 2006, J. Mach. Learn. Res.

[5]  R. Scheines, et al. Interventions and Causal Inference, 2007, Philosophy of Science.

[6]  Richard Scheines, et al. Discovering Causal Structure: Artificial Intelligence, Philosophy of Science, and Statistical Modeling, 1987.

[7]  David Maxwell Chickering, et al. Optimal Structure Identification With Greedy Search, 2002, J. Mach. Learn. Res.

[8]  Peter Spirtes, et al. Directed Cyclic Graphical Representations of Feedback Models, 1995, UAI.

[9]  Daphne Koller, et al. Active Learning for Structure in Bayesian Networks, 2001, IJCAI.

[10]  Kevin P. Murphy, et al. Exact Bayesian structure learning from uncertain interventions, 2007, AISTATS.

[11]  H. R. Pitt. Divergent Series, 1951, Nature.

[12]  Andrea Califano, et al. Lessons from the DREAM 2 Challenges: A Community Effort to Assess Biological Network Inference, 2009.

[13]  S. J. Mason. Feedback Theory-Further Properties of Signal Flow Graphs, 1956, Proceedings of the IRE.

[14]  Tom Burr, et al. Causation, Prediction, and Search, 2003, Technometrics.

[15]  Tom Heskes, et al. Causal discovery in multiple models from different experiments, 2010, NIPS.

[16]  Kevin Murphy, et al. Active Learning of Causal Bayes Net Structure, 2006.

[17]  Frederick Eberhardt, et al. Experiment selection for causal discovery, 2013, J. Mach. Learn. Res.

[18]  David Maxwell Chickering, et al. Learning Equivalence Classes of Bayesian Network Structures, 1996, UAI.

[19]  Jon Williamson, et al. Causality and Probability in the Sciences, 2007.

[20]  S. Lauritzen, et al. Chain graph models and their causal interpretations, 2002.

[21]  David Heckerman, et al. Learning Gaussian Networks, 1994, UAI.

[22]  Mikko Koivisto, et al. Exact Bayesian Structure Discovery in Bayesian Networks, 2004, J. Mach. Learn. Res.

[23]  Erik P. Nyberg, et al. Informative Interventions, 2006.

[24]  Frederick Eberhardt, et al. Combining Experiments to Discover Linear Cyclic Models with Latent Variables, 2010, AISTATS.

[25]  Dario Floreano, et al. Generating Realistic In Silico Gene Networks for Performance Assessment of Reverse Engineering Methods, 2009, J. Comput. Biol.

[26]  Bernard Manderick, et al. Learning Causal Bayesian Networks from Observations and Experiments: A Decision Theoretic Approach, 2006, MDAI.

[27]  N. D. Clarke, et al. Towards a Rigorous Assessment of Systems Biology Models: The DREAM3 Challenges, 2010, PLoS ONE.

[28]  F. Fisher. A Correspondence Principle for Simultaneous Equation Models, 1970.

[29]  S. Wright. The Method of Path Coefficients, 1934.

[30]  Munther A. Dahleh, et al. Structure learning in causal cyclic networks, 2008.

[31]  Gustavo Stolovitzky, et al. Lessons from the DREAM2 Challenges, 2009, Annals of the New York Academy of Sciences.

[32]  Frederick Eberhardt, et al. On the Number of Experiments Sufficient and in the Worst Case Necessary to Identify All Causal Relations Among N Variables, 2005, UAI.

[33]  Judea Pearl, et al. Causal networks: semantics and expressiveness, 2013, UAI.

[34]  Frederick Eberhardt, et al. Causal discovery for linear cyclic models with latent variables, 2010.

[35]  Frederick Eberhardt, et al. Noisy-OR Models with Latent Confounding, 2011, UAI.

[36]  J. I The Design of Experiments, 1936, Nature.