Statistical Models Coupling Allows for Complex Local Multivariate Time Series Analysis

The increasing availability of multivariate time series calls for methods able to analyse them holistically. To this end, we propose a novel, flexible method for data mining, forecasting and causal pattern detection that leverages the coupling of Hidden Markov Models and Gaussian Graphical Models. Given a multivariate non-stationary time series, the proposed method simultaneously clusters time points and infers the probabilistic relationships among variables. The clustering divides the time points into stationary sub-groups, each with an underlying distribution that can be inferred through a graphical model. This coupling can be further exploited to build a time-varying regression model that supports both prediction and insight into the presence of causal patterns. We extensively validate the proposed approach on synthetic data, showing that it outperforms the state of the art on clustering, graphical model inference and prediction. Finally, to demonstrate the applicability of our approach in real-world scenarios, we exploit its characteristics to build a profitable investment portfolio. Results show that we improve on the state of the art, going from a -20% profit to a noticeable 80%.
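
One way to make the coupling of a Hidden Markov Model with Gaussian Graphical Models concrete is the sketch below. It is a generic formulation under stated assumptions, not necessarily the exact model of the paper: the symbols $x_t$, $z_t$, $K$, $\pi$, $A$, $\mu_k$, $\Theta_k$, $\gamma_t(k)$, $S_k$ and $\lambda$ are introduced here for illustration, and the $\ell_1$ (graphical-lasso style) penalty is an assumed choice of sparsity regulariser.

```latex
% Assumed notation: x_t \in \mathbb{R}^d is the observation at time t,
% z_t \in \{1,\dots,K\} is the hidden state (cluster) of time point t.
\begin{align}
  % Markov chain over the hidden states
  p(z_1 = k) &= \pi_k, \qquad
  p(z_t = j \mid z_{t-1} = i) = A_{ij}, \\
  % Gaussian emission; the precision matrix \Theta_k encodes the graph of state k
  x_t \mid z_t = k &\sim \mathcal{N}\!\left(\mu_k,\ \Theta_k^{-1}\right), \\
  % Penalised likelihood: sparsity on the off-diagonal entries of each \Theta_k
  \max_{\pi, A, \{\mu_k, \Theta_k\}} \;
  & \log p(x_1, \dots, x_T \mid \pi, A, \{\mu_k, \Theta_k\})
    - \lambda \sum_{k=1}^{K} \sum_{i \neq j} \left| (\Theta_k)_{ij} \right| .
\end{align}

% In an EM scheme, the E-step gives posteriors \gamma_t(k) = p(z_t = k \mid x_{1:T}).
% The M-step for each state then takes the form of a graphical-lasso problem
% on the posterior-weighted covariance S_k (scaling constants absorbed into \lambda):
\begin{align}
  S_k &= \frac{\sum_{t=1}^{T} \gamma_t(k)\,(x_t - \mu_k)(x_t - \mu_k)^{\top}}
              {\sum_{t=1}^{T} \gamma_t(k)}, \\
  \Theta_k &= \arg\max_{\Theta \succ 0} \;
     \log\det\Theta - \operatorname{tr}(S_k \Theta)
     - \lambda \sum_{i \neq j} \left| \Theta_{ij} \right| .
\end{align}
```

Under this reading, a zero entry $(\Theta_k)_{ij}$ means that variables $i$ and $j$ are conditionally independent within state $k$, while the most likely state sequence assigns each time point to one stationary sub-group.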
