Developing a pattern discovery method in time series data and its GPU acceleration

The Dynamic Time Warping (DTW) algorithm is widely used to find the global alignment of two time series, and many time series data mining and analysis problems can be solved with it. However, using DTW to find similar subsequences is either computationally expensive or fails to produce accurate results. The literature therefore applies parallelisation to speed up DTW, but because of the nature of the algorithm, parallelising it remains an open challenge. In this paper, we first propose a novel method for finding similar local subsequences. Our algorithm first searches for the possible start positions of a subsequence and then finds the best-matching alignment from these positions. We then parallelise the proposed algorithm on GPUs using CUDA and propose an optimisation technique to improve the performance of the parallel GPU implementation. We conducted extensive experiments to evaluate the proposed method. The experimental results demonstrate that the proposed algorithm discovers time series subsequences efficiently and that the proposed GPU-based parallelisation technique further speeds up processing.
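
To make the idea of tracking start positions and then selecting the best-matching alignment concrete, the following is a minimal sequential sketch of standard subsequence DTW with start-position bookkeeping. It is not the method proposed in the paper, whose details are not given in the abstract; the function name `subsequence_dtw`, the squared-difference local cost, and the return format are assumptions made for illustration only.

```python
import math


def subsequence_dtw(query, series):
    """Illustrative sketch (not the paper's algorithm).

    Finds the contiguous subsequence of `series` with the smallest DTW
    distance to `query`. Returns (distance, start, end), where start and
    end are 0-based inclusive indices into `series`.
    """
    n = len(query)
    inf = math.inf
    # prev_cost[i]: best cost of aligning query[0..i] with a subsequence
    # ending at the previous series position; prev_start[i]: where that
    # subsequence begins.
    prev_cost = [inf] * n
    prev_start = [0] * n
    best = (inf, -1, -1)

    for t, y in enumerate(series):
        cur_cost = [inf] * n
        cur_start = [0] * n
        for i, x in enumerate(query):
            local = (x - y) ** 2  # assumed local cost: squared difference
            if i == 0:
                # A match may begin at any position t at no extra cost,
                # which is never worse than extending an older path.
                cur_cost[0] = local
                cur_start[0] = t
                continue
            # Standard DTW step pattern; the start index is inherited
            # from whichever predecessor cell is cheapest.
            cost, start = min(
                (cur_cost[i - 1], cur_start[i - 1]),    # same series point
                (prev_cost[i], prev_start[i]),          # same query point
                (prev_cost[i - 1], prev_start[i - 1]),  # diagonal step
            )
            cur_cost[i] = local + cost
            cur_start[i] = start
        # The last query element closes a candidate match ending at t.
        if cur_cost[n - 1] < best[0]:
            best = (cur_cost[n - 1], cur_start[n - 1], t)
        prev_cost, prev_start = cur_cost, cur_start

    return best
```

Calling `subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0])` reports a zero-distance match spanning indices 2 to 4. In this sketch, each cell depends on its left, upper, and diagonal neighbours, which illustrates the data dependency that makes DTW hard to parallelise.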
