Causal Discovery and Hidden Driving Force Estimation from Nonstationary/Heterogeneous Data

It is commonplace to encounter nonstationary or heterogeneous data, in which the underlying generating process changes over time or across domains. Such distribution shift presents both challenges and opportunities for causal discovery. In this paper, we develop a principled framework for causal discovery from such data, called Constraint-based causal Discovery from NOnstationary/heterogeneous Data (CD-NOD), which addresses two important questions. First, we propose an enhanced constraint-based procedure to detect variables whose local mechanisms change and to recover the skeleton of the causal structure over the observed variables. Second, we present a way to determine causal orientations by exploiting the independent changes in the data distribution implied by the underlying causal model, thereby benefiting from the information carried by changing distributions. After learning the causal structure, we investigate how to efficiently estimate the "driving force" of the nonstationarity of a causal mechanism; that is, we aim to extract from the data a low-dimensional and interpretable representation of the changes. The proposed methods are nonparametric, impose no restrictions on data distributions or causal mechanisms, and do not rely on window segmentation. Furthermore, we find that nonstationarity helps identify causal structure in the presence of particular types of confounders. Finally, we show the tight connection between nonstationarity/heterogeneity and soft intervention in causal discovery. Experimental results on various synthetic and real-world data sets (task-fMRI and stock data) demonstrate the efficacy of the proposed methods.
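As a rough illustration of the first question (detecting variables whose local mechanisms change), the following minimal sketch may help. It is not the authors' implementation. It assumes that a domain index `c` can act as a surrogate for the unobserved changing factors, and it swaps in a linear-Gaussian partial-correlation test (Fisher z) for the nonparametric conditional independence test that the method would need. The helper `partial_corr_pvalue`, the toy model `x1 -> x2`, and the 0.01 threshold are all hypothetical choices made for this example.

```python
import math
import numpy as np


def partial_corr_pvalue(x, y, z=None):
    """Two-sided Fisher z-test p-value for corr(x, y | z).

    A linear-Gaussian stand-in for a nonparametric conditional
    independence test (assumption made for this sketch only).
    """
    n = len(x)
    if z is None:
        k = 0
        rx, ry = x - x.mean(), y - y.mean()
    else:
        # Regress x and y on z (with intercept) and correlate the residuals.
        design = np.column_stack([np.ones(n), z])
        k = design.shape[1] - 1
        rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
        ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    r = float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))
    r = min(max(r, -0.999999), 0.999999)  # keep atanh finite
    stat = math.sqrt(n - k - 3) * abs(math.atanh(r))
    return math.erfc(stat / math.sqrt(2))


# Toy heterogeneous data: five domains, x1 -> x2.
# The mechanism of x1 is fixed; the mechanism of x2 shifts with the domain.
rng = np.random.default_rng(0)
n_domains, n_per_domain = 5, 400
c = np.repeat(np.arange(n_domains), n_per_domain).astype(float)
x1 = rng.normal(size=c.size)
x2 = 0.8 * x1 + 0.5 * c + rng.normal(scale=0.5, size=c.size)

# A variable is flagged as "changing" if it stays dependent on the
# surrogate c given its (here, known) parents.
p_x1 = partial_corr_pvalue(x1, c)        # x1 has no parents
p_x2 = partial_corr_pvalue(x2, c, x1)    # condition on the parent x1
changing = [name for name, p in [("x1", p_x1), ("x2", p_x2)] if p < 0.01]
print(f"p(x1 indep c) = {p_x1:.3g}, p(x2 indep c | x1) = {p_x2:.3g}")
print("flagged as changing:", changing)
```

In this toy setting the parent set is given. A constraint-based procedure would instead search over conditioning sets, and a mean shift is only one of many ways a mechanism can change, which is why a nonparametric test would be needed in general.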
