High-recall causal discovery for autocorrelated time series with latent confounders

We present a new constraint-based method for causal discovery from observational time series in the presence of latent confounders. The method handles linear and nonlinear as well as lagged and contemporaneous causal relationships. We show that existing causal discovery methods such as FCI and its variants suffer from low recall on autocorrelated time series, and we identify the low effect size of the conditional independence tests as the main reason. Information-theoretic arguments show that the effect size can often be increased by including causal parents in the conditioning sets. To identify parents early on, we suggest an iterative procedure that uses novel orientation rules to determine ancestral relationships as early as the edge removal phase. We prove that the method is order-independent, and that it is sound and complete in the oracle case. Extensive simulation studies across different numbers of variables, time lags, sample sizes, and further settings demonstrate that our method achieves much higher recall than existing methods while keeping false positives at the desired level. This performance gain grows with stronger autocorrelation. Our method also covers causal discovery for non-time series data as a special case. We provide Python code for all methods involved in the simulation studies.
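The effect-size argument can be illustrated with a small simulation. The sketch below is not the paper's method or code; it assumes a hypothetical toy model, X_t i.i.d. Gaussian noise and Y_t = a·Y_{t-1} + c·X_t + noise, and compares the plain correlation between X_t and Y_t with the partial correlation given Y_{t-1}, the autocorrelation parent of Y_t. The names `simulate` and `partial_corr`, the coefficient c = 0.3, and the sample size are choices made for this illustration only.

```python
import numpy as np

def simulate(a, c, T, rng):
    """Toy model (an assumption of this sketch): X_t is white noise,
    Y_t = a * Y_{t-1} + c * X_t + eps_t with unit-variance Gaussian noise."""
    x = rng.standard_normal(T)
    eps = rng.standard_normal(T)
    y = np.zeros(T)
    for t in range(1, T):
        y[t] = a * y[t - 1] + c * x[t] + eps[t]
    return x, y

def partial_corr(u, v, Z):
    """Correlation of u and v after linearly regressing out the columns of Z."""
    design = np.column_stack([np.ones(len(u)), Z])  # intercept + conditions
    res_u = u - design @ np.linalg.lstsq(design, u, rcond=None)[0]
    res_v = v - design @ np.linalg.lstsq(design, v, rcond=None)[0]
    return float(np.corrcoef(res_u, res_v)[0, 1])

rng = np.random.default_rng(0)
results = {}
for a in (0.0, 0.5, 0.9):  # autocorrelation strength of Y
    x, y = simulate(a, c=0.3, T=5000, rng=rng)
    x_t, y_t, y_lag = x[1:], y[1:], y[:-1]
    rho_plain = float(np.corrcoef(x_t, y_t)[0, 1])     # empty conditioning set
    rho_cond = partial_corr(x_t, y_t, y_lag[:, None])  # condition on Y_{t-1}
    results[a] = (rho_plain, rho_cond)
    print(f"a={a}: plain={rho_plain:.3f}, given Y_(t-1)={rho_cond:.3f}")
```

In this toy model the population values are c·sqrt((1 − a²)/(1 + c²)) for the plain correlation and c/sqrt(1 + c²) for the partial correlation, so the unconditioned effect size shrinks by a factor sqrt(1 − a²) as autocorrelation grows while the conditioned one does not. The gain here relies on Y_{t-1} being independent of X_t; with other dependence structures, adding a variable to the conditioning set can also lower the effect size.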
