Detecting Subdimensional Motifs: An Efficient Algorithm for Generalized Multivariate Pattern Discovery

Discovering recurring patterns in time series data is a fundamental problem in temporal data mining. This paper addresses the problem of locating subdimensional motifs in real-valued, multivariate time series, which requires the simultaneous discovery of sets of recurring patterns along with the corresponding relevant dimensions. While many approaches to motif discovery have been developed, most are restricted to categorical data, univariate time series, or multivariate data in which the temporal patterns span all of the dimensions. We present an expected linear-time algorithm that addresses a generalization of multivariate pattern discovery in which each motif may span only a subset of the dimensions. To validate our algorithm, we discuss its theoretical properties and empirically evaluate it using several data sets, including synthetic data and motion capture data collected by an on-body inertial sensor.
