Sparse-GEV: Sparse Latent Space Model for Multivariate Extreme Value Time Series Modeling

In many applications of time series models, such as climate analysis and social media analysis, we are often interested in extreme events, such as heat waves, wind gusts, and bursts of topics. Time series of such events usually follow a heavy-tailed distribution rather than a Gaussian distribution. This poses great challenges to existing approaches, because their distributional assumptions differ significantly from the data and because past data on extreme events are scarce. In this paper, we propose the Sparse-GEV model, a latent state model based on extreme value theory that automatically learns sparse temporal dependence and makes predictions. Our model is theoretically significant because it is among the first to learn sparse temporal dependencies among multivariate extreme value time series. We demonstrate that our algorithm outperforms state-of-the-art methods, including Granger causality, the copula approach, and transfer entropy, on one synthetic dataset, one climate dataset, and two Twitter datasets.
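To make the heavy-tail point concrete, the following is a minimal sketch of the generalized extreme value (GEV) cumulative distribution function, the standard distribution of extreme value theory for block maxima. It is an illustration only, not the Sparse-GEV model itself: the function name `gev_cdf` and the parameter names `mu`, `sigma`, and `xi` (location, scale, shape) are assumptions chosen for this example, and the sign convention takes `xi > 0` as the heavy-tailed case.

```python
import math

def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """GEV cumulative distribution function (illustrative helper, not from the paper).

    mu: location, sigma: scale (> 0), xi: shape.
    xi > 0 gives a heavy (Frechet-type) upper tail, xi = 0 the Gumbel case,
    and xi < 0 a bounded (Weibull-type) upper tail.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit of the GEV family as xi -> 0.
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # x lies outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))

# Probability of exceeding a high threshold under a heavy tail and a Gumbel tail.
heavy_tail = 1.0 - gev_cdf(5.0, xi=0.5)
gumbel_tail = 1.0 - gev_cdf(5.0, xi=0.0)
```

With the same location and scale, the exceedance probability at a high threshold is markedly larger for a positive shape parameter than for the Gumbel case, which is why models built on Gaussian assumptions tend to underestimate extreme events.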
