Factor Modelling for Clustering High-dimensional Time Series

We propose a new unsupervised learning method for clustering a large number of time series based on a latent factor structure. Each cluster is characterized by its own cluster-specific factors, in addition to common factors that affect all the time series concerned. Our setting also allows some time series not to belong to any cluster. Consistency with explicit convergence rates is established for the estimation of the common factors, the cluster-specific factors, and the latent clusters. Numerical illustrations with both simulated data and a real data example are also reported. As a spin-off, the proposed approach also significantly advances statistical inference for the factor model of Lam and Yao (2012).

Keywords: Autocovariance matrices; Clustering time series; Eigenanalysis; Idiosyncratic
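For orientation, the eigenanalysis of autocovariance matrices underlying the factor model of Lam and Yao (2012) can be sketched as follows. This is a minimal illustration under stated assumptions, not the clustering procedure proposed here, which the abstract does not spell out. It assumes the model y_t = A x_t + e_t with white-noise e_t, builds M as the sum over lags 1..k0 of S(k) S(k)^T from the sample autocovariance matrices S(k), and picks the number of factors by the ratio of successive eigenvalues. The function names, the default k0 = 5, and the search range R = p // 2 are illustrative choices.

```python
import numpy as np

def lagged_autocov(y, k):
    """Sample autocovariance at lag k >= 1 for a (T, p) array y: estimates Cov(y_{t+k}, y_t)."""
    T = y.shape[0]
    yc = y - y.mean(axis=0)
    return yc[k:].T @ yc[:T - k] / T  # p x p matrix

def estimate_factors(y, k0=5, r=None):
    """Estimate the factor loading space from the eigenvectors of M.

    If r is None, the number of factors is chosen by minimising the ratio of
    successive eigenvalues (an assumption of this sketch: search over the
    first p // 2 ratios).
    """
    p = y.shape[1]
    M = np.zeros((p, p))
    for k in range(1, k0 + 1):
        S = lagged_autocov(y, k)
        M += S @ S.T  # symmetric, non-negative definite
    eigvals, eigvecs = np.linalg.eigh(M)   # ascending order
    order = np.argsort(eigvals)[::-1]      # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if r is None:
        R = p // 2
        ratios = eigvals[1:R + 1] / eigvals[:R]
        r = int(np.argmin(ratios)) + 1
    A_hat = eigvecs[:, :r]  # orthonormal basis of the estimated loading space
    x_hat = y @ A_hat       # estimated factor series, shape (T, r)
    return r, A_hat, x_hat
```

Only the space spanned by the columns of the loading matrix is identifiable, so `A_hat` should be compared with a true loading matrix through the subspaces they span rather than entry by entry.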

[1] Elizabeth Ann Maharaj, et al. Time-Series Clustering, 2015.

[2] Gary Chamberlain, et al. Funds, Factors, and Diversification in Arbitrage Pricing Models, 1983.

[3] Sylvia Kaufmann, et al. Model-Based Clustering of Multiple Time Series, 2004.

[4] Marc Hallin, et al. The Generalized Dynamic Factor Model. One-Sided Estimation and Forecasting, 2003.

[5] Common structure in panels of short ecological time-series, 2000, Proceedings of the Royal Society of London. Series B: Biological Sciences.

[6] N. Chan, et al. Factor Modelling for High‐Dimensional Time Series: Inference and Model Selection, 2017.

[7] Q. Yao, et al. High dimensional stochastic regression with latent factors, endogeneity and nonlinearity, 2013, arXiv:1310.1990.

[8] Elizabeth Ann Maharaj, et al. Time Series Clustering and Classification, 2019.

[9] Marco Lippi, et al. Factor models in high-dimensional time series—A time-domain approach, 2013.

[10] Eamonn Keogh. Exact Indexing of Dynamic Time Warping, 2002, VLDB.

[11] Robert H. Shumway, et al. Discrimination and Clustering for Multivariate Time Series, 1998.

[12] Clifford Lam, et al. Factor modeling for high-dimensional time series: inference for the number of factors, 2012, arXiv:1206.0613.

[13] T. Warren Liao, et al. Clustering of time series data - a survey, 2005, Pattern Recognit.

[14] Tomohiro Ando, et al. Clustering Huge Number of Financial Time Series: A Panel Data Approach With High-Dimensional Predictors and Factor Structures, 2015.

[15] Daniel Peña, et al. Nonstationary dynamic factor analysis, 2006.

[16] Roman Vershynin, et al. Introduction to the non-asymptotic analysis of random matrices, 2010, Compressed Sensing.

[17] Eamonn J. Keogh, et al. Clustering of time-series subsequences is meaningless: implications for previous and future research, 2003, Third IEEE International Conference on Data Mining.

[18] Nuno Constantino Castro, et al. Time Series Data Mining, 2009, Encyclopedia of Database Systems.

[19] Jianfeng Yao, et al. Identifying the number of factors from singular values of a large sample auto-covariance matrix, 2017.

[20] George E. P. Box, et al. Identifying a Simplifying Structure in Time Series, 1987.

[21] Ying Wah Teh, et al. Time-series clustering - A decade review, 2015, Inf. Syst.

[22] Azadeh Khaleghi, et al. Consistent Algorithms for Clustering Time Series, 2016, J. Mach. Learn. Res.

[23] Eamonn J. Keogh, et al. Clustering of time-series subsequences is meaningless: implications for previous and future research, 2004, Knowledge and Information Systems.

[24] M. Rothschild, et al. Arbitrage, Factor Structure, and Mean-Variance Analysis on Large Asset Markets, 1982.

[25] Andrés M. Alonso, et al. Clustering time series by linear dependency, 2018, Stat. Comput.

[26] Ting Zhang, et al. Clustering High-Dimensional Time Series Based on Parallelism, 2013.

[27] Saeed Aghabozorgi, et al. A Review of Subsequence Time Series Clustering, 2014, TheScientificWorldJournal.