Matrix Profile VI: Meaningful Multidimensional Motif Discovery

Time series motifs are approximately repeating patterns in real-valued time series data. They are useful for exploratory data mining and are often used as inputs for various time series clustering, classification, segmentation, rule discovery, and visualization algorithms. Since the introduction of the first motif discovery algorithm for univariate time series in 2002, multiple efforts have been made to generalize motifs to the multidimensional case. In this work, we show that these efforts, which typically attempt to find motifs on all dimensions, will not produce meaningful motifs except in the most contrived situations. We explain this finding and introduce mSTAMP, an algorithm that allows meaningful discovery of multidimensional motifs. Beyond producing objectively and subjectively meaningful results, our algorithm has a host of additional advantages, including being much faster, requiring fewer parameters, and supporting streaming data. We demonstrate the utility of our mSTAMP-based motif discovery framework on domains as diverse as audio processing, industry, and sports analytics.
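The claim that motifs searched on all dimensions are rarely meaningful can be illustrated with a small brute-force sketch. It is not the authors' mSTAMP implementation, and the abstract does not specify the algorithm; the sketch assumes a k-dimensional matrix profile defined as the mean of the k smallest per-dimension z-normalized distances to the best non-trivial match. The names `znorm` and `multidim_matrix_profile`, the exclusion zone of `m // 2`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; a constant subsequence maps to zeros."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def multidim_matrix_profile(T, m):
    """Brute-force k-dimensional matrix profile (illustrative, O(d * n^2 * m)).

    T has shape (d, n). Row k-1 of the result holds, for each subsequence,
    the smallest mean distance over its best-matching k dimensions.
    """
    d, n = T.shape
    n_sub = n - m + 1
    # Z-normalized subsequences, shape (d, n_sub, m).
    subs = np.array([[znorm(T[j, i:i + m]) for i in range(n_sub)]
                     for j in range(d)])
    P = np.full((d, n_sub), np.inf)
    excl = m // 2  # assumed exclusion zone to skip trivial self-matches
    for i in range(n_sub):
        # Per-dimension distance profile against subsequence i, shape (d, n_sub).
        D = np.linalg.norm(subs - subs[:, i:i + 1, :], axis=2)
        # Sort dimensions by distance for each candidate, then average the best k.
        D.sort(axis=0)
        D = np.cumsum(D, axis=0) / np.arange(1, d + 1)[:, None]
        D[:, max(0, i - excl):min(n_sub, i + excl + 1)] = np.inf
        P[:, i] = D.min(axis=1)
    return P

# Synthetic data: a pattern repeats in dimensions 0 and 1 only; dimension 2 is noise.
rng = np.random.default_rng(0)
T = rng.normal(size=(3, 200))
pattern = np.sin(np.linspace(0, 2 * np.pi, 20))
for start in (30, 130):
    T[0, start:start + 20] = pattern
    T[1, start:start + 20] = pattern

P = multidim_matrix_profile(T, m=20)
print("best 2-dimensional motif starts at", int(np.argmin(P[1])))
print("min distance using 2 dims:", P[1].min(), "using all 3 dims:", P[2].min())
```

In this toy setup the two-dimensional profile isolates the planted pattern, while forcing the irrelevant third dimension into the distance raises the best achievable score, which is the effect the abstract describes.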
