GRAB: Finding Time Series Natural Structures via A Novel Graph-based Scheme

In recent years, the widespread use of sensors has substantially stimulated researchers' interest in time series data mining. Real-world time series often include natural structures. For example, a time series captured from a patient rehabilitation app can be divided into a sequence of movements, e.g., sitting, standing, and walking. Finding time series natural structures (i.e., latent semantic states) is one of the core subroutines in time series mining applications. However, this task is nontrivial because it poses two challenges: (1) how to determine the correct change points between consecutive segments, and (2) how to cluster segments into different states.

In this paper, we propose a novel graph-based approach, GRAB, to discover time series natural structures. In particular, GRAB first partitions the time series into a set of non-overlapping fragments based on the similarity between subsequences. Then, it constructs a fragment-based graph and employs a graph partition method to cluster the fragments into states. Extensive experiments on real-world datasets demonstrate the effectiveness and efficiency of our GRAB method. Specifically, GRAB finds high-quality latent states, and it outperforms state-of-the-art solutions by orders of magnitude.
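To make the two-stage idea concrete, the following is a minimal sketch of a fragment-then-graph pipeline; it is not the GRAB algorithm itself. Several simplifications are assumptions made only for illustration: fragments have a fixed length (`frag_len`) rather than being derived from subsequence similarity, each fragment is summarised by its mean and standard deviation, and connected components of a thresholded distance graph stand in for the graph partition method. The function names and the `radius` parameter are hypothetical.

```python
import numpy as np

def fragment_features(series, frag_len):
    """Cut the series into non-overlapping, fixed-length fragments (an assumed
    simplification) and summarise each by its mean and standard deviation."""
    n_frags = len(series) // frag_len
    frags = series[: n_frags * frag_len].reshape(n_frags, frag_len)
    return np.column_stack([frags.mean(axis=1), frags.std(axis=1)])

def cluster_fragments(features, radius):
    """Build a graph linking fragments whose features lie within `radius`,
    then label each connected component as one state (a stand-in for a
    real graph partition method)."""
    n = len(features)
    dist = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    adjacent = dist <= radius
    labels = np.full(n, -1)
    state = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = state
        stack = [start]
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for nb in np.flatnonzero(adjacent[node]):
                if labels[nb] == -1:
                    labels[nb] = state
                    stack.append(nb)
        state += 1
    return labels

def change_points(labels, frag_len):
    """Positions where the state label changes between adjacent fragments."""
    return [i * frag_len for i in range(1, len(labels)) if labels[i] != labels[i - 1]]

# Synthetic example: state A (unit sine), state B (offset, small sine), state A again.
t = np.arange(100)
state_a = np.sin(2 * np.pi * t / 20)
state_b = 5 + 0.1 * np.sin(2 * np.pi * t / 20)
series = np.concatenate([state_a, state_b, state_a])

labels = cluster_fragments(fragment_features(series, frag_len=20), radius=1.0)
print(labels.tolist())            # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
print(change_points(labels, 20))  # [100, 200]
```

In this toy setting the first and third segments receive the same state label, which illustrates how clustering fragments addresses both challenges at once: change points fall where adjacent fragments belong to different states, and recurring states are recognised as the same cluster.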
