ScrimpCo: scalable matrix profile on commodity heterogeneous processors

Discovering time series motifs and discords is an important and challenging problem in time series analysis. In this work, we present ScrimpCo, a heterogeneous implementation of SCRIMP, an existing algorithm that excels at finding relevant subsequences in time series. We propose and evaluate several static, dynamic and adaptive partitioning strategies targeting commodity processors, on both homogeneous (multicore CPU) and heterogeneous (CPU + GPU) architectures. For the CPU + GPU implementation, we explore a heterogeneous parallel_reduce pattern that offloads part of the computation to an OpenCL-capable GPU while the CPU cores process the rest. Our heterogeneous scheduler, built on top of TBB, pays special attention to balancing the computational load between the GPU and the CPU cores. The experimental results show that our homogeneous implementation scales linearly and that our heterogeneous implementation reaches near-ideal performance on commodity processors that feature an on-chip GPU.
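As an illustration of the partition-and-reduce idea described above, the following is a minimal sequential sketch in Python, not the ScrimpCo implementation. It assumes a diagonal-wise matrix profile computation with z-normalized Euclidean distance and an exclusion zone of m // 4, splits the diagonals statically into two partitions that stand in for the GPU and the CPU cores, and merges the two partial profiles with an element-wise minimum. The function names and the `gpu_share` parameter are hypothetical, and the sketch assumes no window of the series is constant.

```python
import math

def sliding_stats(t, m):
    """Mean and standard deviation of every length-m window of t."""
    means, stds = [], []
    for i in range(len(t) - m + 1):
        w = t[i:i + m]
        mu = sum(w) / m
        means.append(mu)
        stds.append(math.sqrt(sum((x - mu) ** 2 for x in w) / m))
    return means, stds

def profile_over_diagonals(t, m, diagonals, means, stds):
    """Partial matrix profile restricted to the given diagonals."""
    l = len(t) - m + 1
    profile, index = [math.inf] * l, [-1] * l
    for k in diagonals:
        # Dot product of the first pair on the diagonal, then updated incrementally.
        q = sum(t[x] * t[x + k] for x in range(m))
        for i in range(l - k):
            j = i + k
            if i > 0:
                q += t[i + m - 1] * t[j + m - 1] - t[i - 1] * t[j - 1]
            # Assumes stds are non-zero (no constant windows).
            corr = (q - m * means[i] * means[j]) / (m * stds[i] * stds[j])
            d = math.sqrt(max(0.0, 2 * m * (1 - corr)))
            if d < profile[i]:
                profile[i], index[i] = d, j
            if d < profile[j]:
                profile[j], index[j] = d, i
    return profile, index

def merge(a, b):
    """Reduction step: element-wise minimum of two partial profiles."""
    (pa, ia), (pb, ib) = a, b
    profile, index = list(pa), list(ia)
    for x in range(len(pa)):
        if pb[x] < profile[x]:
            profile[x], index[x] = pb[x], ib[x]
    return profile, index

def matrix_profile_split(t, m, gpu_share):
    """Static partition: the first gpu_share fraction of diagonals forms one
    partition (hypothetically the GPU's), the remainder the other (the CPU's)."""
    l = len(t) - m + 1
    means, stds = sliding_stats(t, m)
    diagonals = list(range(m // 4 + 1, l))  # skip the exclusion zone
    cut = int(len(diagonals) * gpu_share)
    part_a = profile_over_diagonals(t, m, diagonals[:cut], means, stds)
    part_b = profile_over_diagonals(t, m, diagonals[cut:], means, stds)
    return merge(part_a, part_b)
```

Because each diagonal is processed independently and the merge is a minimum, the result does not depend on where the cut falls, which is what leaves a scheduler free to pick the split purely for load balance. Note that earlier diagonals are longer than later ones, so an equal count of diagonals is not an equal amount of work.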
