Matrix Profile XIX: Time Series Semantic Motifs: A New Primitive for Finding Higher-Level Structure in Time Series

Time series motifs are approximately repeated patterns in real-valued temporal data. They are used in exploratory data mining tasks including clustering, classification, segmentation, and rule discovery. However, their current definition is limited to literal or near-exact matches and cannot discover higher-level semantic structure. Consider a time series generated by an accelerometer on a smartwatch. Such data offers the possibility of finding motifs in human behavior, for example the motif generated by a handshake. Under current motif definitions, a single-pump handshake would not match a three-pump handshake, even though they are culturally and semantically equivalent events. In this work we generalize the definition of motifs to one that captures higher-level semantic structure. We refer to these as time series semantic motifs. Surprisingly, this increased expressiveness does not come at a great cost: our algorithm, Semantic-Motif-Finder, takes approximately the same time as current state-of-the-art motif discovery algorithms. Furthermore, we demonstrate the utility of our ideas on diverse datasets.
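The handshake limitation can be illustrated with a small synthetic sketch. It assumes that "current motif definitions" compare equal-length subsequences under z-normalized Euclidean distance, as classic motif discovery does. The `handshake` generator, its parameters, and the signal shapes are hypothetical stand-ins for accelerometer data. This is not the Semantic-Motif-Finder algorithm, which the abstract does not describe.

```python
import math

def znorm(x):
    """Z-normalize a sequence (zero mean, unit population std)."""
    mu = sum(x) / len(x)
    sd = math.sqrt(sum((v - mu) ** 2 for v in x) / len(x))
    return [(v - mu) / sd for v in x]

def znorm_dist(a, b):
    """Z-normalized Euclidean distance between two equal-length sequences."""
    za, zb = znorm(a), znorm(b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(za, zb)))

def handshake(pumps, length=120, pump_len=20, start=10, amplitude=1.0):
    """Hypothetical accelerometer trace: `pumps` sine cycles on a flat baseline."""
    x = [0.0] * length
    for i in range(pumps * pump_len):
        x[start + i] = amplitude * math.sin(2 * math.pi * i / pump_len)
    return x

one_pump = handshake(1)
one_pump_strong = handshake(1, amplitude=1.3)  # same gesture, firmer grip
three_pump = handshake(3)

# Z-normalization absorbs the amplitude change, so these two match closely.
d_same = znorm_dist(one_pump, one_pump_strong)
# The extra pumps change the shape itself, so the literal distance is large.
d_diff = znorm_dist(one_pump, three_pump)

print(f"one-pump vs stronger one-pump: {d_same:.3f}")
print(f"one-pump vs three-pump:        {d_diff:.3f}")
```

In this toy setting the two one-pump traces are near-identical after normalization, while the one-pump and three-pump traces are far apart even though they represent the same kind of event. A shape-based motif definition would therefore pair the first two and miss the third.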
