Analysis of Subsequence Time-Series Clustering Based on Moving Average

Subsequence time-series clustering (STSC), which consists of cutting out subsequences with a sliding window and clustering them with k-means, has been commonly used in time-series data mining. However, it has been pointed out that STSC always generates moderate sinusoidal patterns, independently of the input. To address this problem, we explain theoretically and confirm empirically the similarity between STSC and a moving average. The present analysis is consistent with, and simpler than, one of the most important analyses of STSC. We also call into question pattern extraction in the time domain and discuss an alternative solution.
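
The two operations compared in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the random-walk toy input, the window length w = 16, and k = 3 are all hypothetical choices, and the k-means is a plain Lloyd iteration with squared Euclidean distance.

```python
import random


def sliding_windows(series, w):
    """Cut out every length-w subsequence with a stride-1 sliding window."""
    return [series[i:i + w] for i in range(len(series) - w + 1)]


def moving_average(series, w):
    """Unweighted moving average taken over the same windows."""
    return [sum(win) / w for win in sliding_windows(series, w)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign p to the nearest center (squared Euclidean distance).
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        new_centers = []
        for j in range(k):
            if clusters[j]:
                # New center = coordinate-wise mean of the cluster members.
                n = len(clusters[j])
                new_centers.append([sum(col) / n for col in zip(*clusters[j])])
            else:
                # Keep the old center if a cluster becomes empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Toy input (an assumption for illustration): a seeded random walk.
rng = random.Random(42)
series = [0.0]
for _ in range(299):
    series.append(series[-1] + rng.gauss(0.0, 1.0))

w, k = 16, 3
subsequences = sliding_windows(series, w)   # STSC step 1: subsequence cutout
centers = kmeans(subsequences, k)           # STSC step 2: k-means clustering
smoothed = moving_average(series, w)        # the operation STSC is compared to
```

Each cluster center is an average over many overlapping windows, whereas each moving-average value is an average within one window. Plotting the rows of `centers` against `smoothed` is one way to inspect the similarity the abstract refers to; the sketch itself makes no claim about how close they are for a given input.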
