Sparse Causal Discovery in Multivariate Time Series

Our goal is to estimate causal interactions in multivariate time series. In vector autoregressive (VAR) models, such interactions can be defined through the non-vanishing coefficients that link the time-lagged values of one series to the current value of another. Because a parsimonious causal structure is assumed in most cases, a promising approach to causal discovery is to fit VAR models with an additional sparsity-promoting regularization. Along these lines, we propose that sparsity be enforced on the subgroups of coefficients belonging to each pair of time series, since the absence of a causal relation requires the coefficients for all time lags to be jointly zero. This behavior can be achieved by l1-l2-norm regularized regression, for which an efficient active set solver has recently been proposed. Our method is shown to outperform standard methods in recovering simulated causality graphs. The results are on par with those of a second novel approach that uses multiple statistical testing.
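
The grouping idea can be illustrated with a minimal sketch. It is not the active set solver referred to above: it swaps in plain proximal gradient descent (ISTA) with a group soft-threshold, which is simpler but slower. The function names, the 1/(2n) scaling of the squared error, the choice to penalize self-pairs like all other pairs, and the toy simulation are assumptions made for illustration only.

```python
import numpy as np

def build_lagged_design(X, P):
    """X has shape (T, M). Returns targets Y (T-P, M) and lagged predictors Z (T-P, M*P).
    Column j*P + (p-1) of Z holds series j at lag p, so each source series is one contiguous block."""
    T, M = X.shape
    Z = np.empty((T - P, M * P))
    for j in range(M):
        for p in range(1, P + 1):
            Z[:, j * P + (p - 1)] = X[P - p:T - p, j]
    return X[P:], Z

def group_lasso_var(X, P, lam, n_iter=500):
    """Minimize (1/2n)||Y - Z B||_F^2 + lam * sum over pairs (j, i) of ||B[j, :, i]||_2.
    Returns coefficients of shape (M, P, M), indexed as [source j, lag p-1, target i]."""
    M = X.shape[1]
    Y, Z = build_lagged_design(X, P)
    n = Y.shape[0]
    L = np.linalg.norm(Z, 2) ** 2 / n          # Lipschitz constant of the smooth part
    B = np.zeros((M * P, M))
    for _ in range(n_iter):
        V = B - (Z.T @ (Z @ B - Y) / n) / L    # gradient step
        Vg = V.reshape(M, P, M)                # split rows into (source, lag) blocks
        norms = np.linalg.norm(Vg, axis=1, keepdims=True)
        # Group soft-threshold: all lags of a pair shrink together and vanish jointly.
        scale = np.maximum(0.0, 1.0 - (lam / L) / np.maximum(norms, 1e-12))
        B = (Vg * scale).reshape(M * P, M)
    return B.reshape(M, P, M)

def pair_strength(B):
    """l2 norm over lags for every (source, target) pair; zero means no estimated link."""
    return np.linalg.norm(B, axis=1)

def simulate_example(T=4000, seed=0):
    """Toy VAR(1) with three series in which only series 0 drives series 1 (assumed setup)."""
    rng = np.random.default_rng(seed)
    A1 = np.diag([0.5, 0.5, 0.5])              # A1[i, j]: effect of series j on series i
    A1[1, 0] = 0.5
    X = np.zeros((T, 3))
    for t in range(1, T):
        X[t] = A1 @ X[t - 1] + rng.standard_normal(3)
    return X

if __name__ == "__main__":
    S = pair_strength(group_lasso_var(simulate_example(), P=2, lam=0.15))
    print(np.round(S, 3))                      # rows: source series, columns: target series
```

Because the penalty acts on the l2 norm of each pair's lag vector rather than on individual coefficients, a pair is either dropped for all lags at once or kept as a whole, which is the behavior argued for above. The regularization strength `lam` would in practice be chosen by cross-validation or a similar criterion.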
