Grouped graphical Granger modeling methods for temporal causal modeling

We develop and evaluate a family of approaches to causal modeling from time series data, collectively referred to as "grouped graphical Granger modeling methods." Graphical Granger modeling applies graphical modeling techniques to time series data and invokes the notion of "Granger causality" to make assertions about causality among a potentially large number of time series variables through inference on time-lagged effects. This paper proposes a novel enhancement to the graphical Granger methodology: we develop and apply families of regression methods that are sensitive to group information among variables, so as to leverage the group structure that the lagged temporal variables inherit from the time series they belong to. Additionally, we propose a new family of algorithms, which we call group boosting, as an improved component of grouped graphical Granger modeling relative to the existing regression methods with grouped variable selection in the literature (e.g., group Lasso). The introduction of group boosting methods is primarily motivated by the need to handle non-linearity in the data. Our empirical evaluation confirms the advantage of the grouped graphical Granger methods over the standard (non-grouped) methods, as well as the additional advantage specific to the methods based on group boosting. These advantages are also demonstrated in a real-world application: gene regulatory network discovery from time-course microarray data.
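To make the grouping idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes a plain linear model and uses group Lasso, solved by proximal gradient descent, as the grouped regression method; group boosting is not shown. The function names (`lagged_design`, `group_lasso`, `granger_graph`), the penalty weight `lam`, the lag order, and the synthetic data are all illustrative assumptions. For each target series, the lagged values of every series form one group, and an edge j -> i is reported when the whole group of lags of series j receives a nonzero coefficient vector in the regression for series i.

```python
import numpy as np


def lagged_design(X, max_lag):
    """Stack lags 1..max_lag of every series; columns are grouped by source series."""
    T, p = X.shape
    cols, groups = [], []
    for j in range(p):
        for lag in range(1, max_lag + 1):
            # Row t of this column holds series j at time (max_lag + t - lag).
            cols.append(X[max_lag - lag:T - lag, j])
            groups.append(j)
    Z = np.column_stack(cols)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)  # standardize each lagged column
    return Z, np.array(groups)


def group_lasso(Z, y, groups, lam, n_iter=500):
    """Minimize (1/2n)||y - Z b||^2 + lam * sum_g sqrt(|g|) ||b_g|| by proximal gradient."""
    n, d = Z.shape
    step = n / np.linalg.norm(Z, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(d)
    for _ in range(n_iter):
        grad = Z.T @ (Z @ beta - y) / n
        beta = beta - step * grad
        # Group soft-thresholding: shrink each group as a block, possibly to exactly zero.
        for g in np.unique(groups):
            idx = groups == g
            norm = np.linalg.norm(beta[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            beta[idx] = 0.0 if norm <= thresh else beta[idx] * (1 - thresh / norm)
    return beta


def granger_graph(X, max_lag, lam):
    """adj[j, i] is True when the lag group of series j is selected for target series i."""
    p = X.shape[1]
    Z, groups = lagged_design(X, max_lag)
    adj = np.zeros((p, p), dtype=bool)
    for i in range(p):
        y = X[max_lag:, i] - X[max_lag:, i].mean()
        beta = group_lasso(Z, y, groups, lam)
        for j in range(p):
            adj[j, i] = np.linalg.norm(beta[groups == j]) > 0
    return adj


# Synthetic example (assumed data): series 1 is driven by series 0 at lag 1,
# while series 2 is independent noise.
rng = np.random.default_rng(0)
T = 500
X = rng.standard_normal((T, 3))
X[1:, 1] = 0.8 * X[:-1, 0] + 0.1 * X[1:, 1]

adj = granger_graph(X, max_lag=2, lam=0.2)
print(adj.astype(int))
```

Treating all lags of one series as a single group means a series is either kept or dropped as a whole, which matches the question Granger causality asks about a pair of series. A non-grouped Lasso would instead select individual lags, which can scatter weak selections across many series.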
