Accelerating Dynamic Time Warping Subsequence Search with GPUs and FPGAs

Many time series data mining problems require subsequence similarity search as a subroutine. Dozens of similarity and distance measures have been proposed in the last decade, and there is increasing evidence that Dynamic Time Warping (DTW) is the best measure across a wide range of domains. Given DTW’s usefulness and ubiquity, there has been a large community-wide effort to mitigate its high computational cost. Proposed speedup techniques include early abandoning strategies, lower-bound-based pruning, indexing, and embedding. In this work we argue that we are now close to exhausting the speedups achievable in software alone, and that we must turn to hardware-based solutions. With this motivation, we investigate both GPU (Graphics Processing Unit) and FPGA (Field Programmable Gate Array) based acceleration of subsequence similarity search under the DTW measure. As we shall show, our novel algorithms allow GPUs to achieve a speedup of two orders of magnitude and FPGAs a speedup of four orders of magnitude. We conduct detailed case studies on the classification of astronomical observations and demonstrate that our ideas allow us to tackle problems that would otherwise be untenable.
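To make the software baseline concrete, the following is a minimal CPU-only sketch of subsequence search under DTW that combines two of the speedup techniques named above: lower-bound-based pruning and early abandoning. It is an illustration under stated assumptions, not the GPU or FPGA algorithm of this work. The function names (`lb_keogh`, `dtw`, `best_match`), the squared pointwise cost, the Sakoe-Chiba band parameter `window`, and the choice of an envelope-based lower bound are all assumptions made for the example; normalization of subsequences is omitted for brevity.

```python
import math


def lb_keogh(query, candidate, window):
    """Envelope-based lower bound on band-constrained DTW (squared costs).

    Assumes query and candidate have equal length (illustrative helper).
    """
    n = len(query)
    total = 0.0
    for i, c in enumerate(candidate):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        upper = max(query[lo:hi])
        lower = min(query[lo:hi])
        if c > upper:
            total += (c - upper) ** 2
        elif c < lower:
            total += (c - lower) ** 2
    return total


def dtw(a, b, window, best_so_far=math.inf):
    """DTW with a Sakoe-Chiba band; returns inf if it abandons early."""
    n, m = len(a), len(b)
    window = max(window, abs(n - m))
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(max(1, i - window), min(m, i + window) + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        # Costs are non-negative, so the final distance cannot be smaller
        # than the cheapest cell in this row: abandon if that is too large.
        if min(curr) > best_so_far:
            return math.inf
        prev = curr
    return prev[m]


def best_match(series, query, window):
    """Return (start index, distance) of the best-matching subsequence."""
    m = len(query)
    best_dist, best_pos = math.inf, -1
    for start in range(len(series) - m + 1):
        candidate = series[start:start + m]
        # Prune: the cheap lower bound already rules this candidate out.
        if lb_keogh(query, candidate, window) >= best_dist:
            continue
        d = dtw(query, candidate, window, best_dist)
        if d < best_dist:
            best_dist, best_pos = d, start
    return best_pos, best_dist


series = [0, 0, 1, 2, 3, 2, 1, 0, 0, 5, 5]
query = [1, 2, 3, 2, 1]
print(best_match(series, query, window=1))  # (2, 0.0)
```

Even with both optimizations, every candidate window is still visited sequentially and the surviving candidates each need a quadratic-time dynamic program within the band, which is the cost that hardware acceleration targets.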
