Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data
Abstract: Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. For example, raw sensor data from a fitness-tracking application can be expressed as a timeline of a select few actions (e.g., walking, sitting, running). However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Furthermore, interpreting the resulting clusters is difficult, especially when the data is high-dimensional. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through alternating minimization, using a variation of the expectation maximization (EM) algorithm. We derive closed-form solutions to efficiently solve the two resulting subproblems in a scalable way, through dynamic programming and the alternating direction method of multipliers (ADMM), respectively. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile sensor dataset how TICC can be used to learn interpretable clusters in real-world scenarios.
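To illustrate the kind of dynamic program the abstract refers to, the sketch below assigns each time point to a cluster by minimizing a per-point cost plus a fixed penalty for every change of cluster. The abstract does not state the objective, so this form is an assumption, and the names `assign_clusters`, `cost`, and `beta` are hypothetical. In TICC the per-point cost would presumably come from each cluster's fitted model; here it is simply a given table.

```python
def assign_clusters(cost, beta):
    """Minimum-cost cluster path via dynamic programming (illustrative sketch).

    cost[t][k]: assumed cost of placing time point t in cluster k.
    beta:       assumed non-negative penalty paid at every change of cluster.
    Returns (path, total_cost).
    """
    num_points = len(cost)
    num_clusters = len(cost[0])

    # best[k] is the cheapest total cost of any path that ends in cluster k.
    best = list(cost[0])
    back = []  # back[t - 1][k]: cluster at t - 1 on the best path ending in k at t

    for t in range(1, num_points):
        # The cheapest predecessor overall is the only one worth switching from.
        cheapest = min(range(num_clusters), key=lambda k: best[k])
        new_best, pointers = [], []
        for k in range(num_clusters):
            stay = best[k]
            switch = best[cheapest] + beta
            if stay <= switch:
                new_best.append(stay + cost[t][k])
                pointers.append(k)
            else:
                new_best.append(switch + cost[t][k])
                pointers.append(cheapest)
        best = new_best
        back.append(pointers)

    # Trace the optimal path backwards from the cheapest final cluster.
    k = min(range(num_clusters), key=lambda j: best[j])
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return path, total


if __name__ == "__main__":
    toy_cost = [[0, 1], [1, 0], [0, 1]]
    print(assign_clusters(toy_cost, beta=0))   # no penalty: follow the cheapest cluster
    print(assign_clusters(toy_cost, beta=10))  # large penalty: stay in one cluster
```

A larger `beta` favors longer contiguous segments, which is one way segmentation and clustering can be handled in a single pass over the series.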