Time series anomaly discovery with grammar-based compression

The problem of anomaly detection in time series has recently received much attention. However, many existing techniques require the user to provide the length of a potential anomaly, which is often unreasonable for real-world problems. They are also often built upon computing costly distance functions, a procedure that may account for up to 99% of an algorithm's computation time. Addressing these limitations, we propose two algorithms that use grammar induction to aid anomaly detection without any prior knowledge. Our approach discretizes continuous time series values into symbolic form, infers a context-free grammar, and exploits its hierarchical structure to effectively and efficiently discover algorithmic irregularities that we relate to anomalies. The approach is based on the general principle of Kolmogorov complexity, where the randomness in a sequence is a function of its algorithmic incompressibility. Since a grammar induction process naturally compresses the input sequence by learning regularities and encoding them compactly with grammar rules, the algorithm's inability to compress a subsequence indicates its Kolmogorov (algorithmic) randomness and its correspondence to an anomaly. We show that our approaches not only allow discovery of multiple variable-length anomalous subsequences at once, but also significantly outperform the current state-of-the-art exact algorithms for time series anomaly detection.
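The pipeline described above (discretize, induce a grammar, flag what the grammar fails to compress) can be illustrated with a minimal Python sketch. The abstract does not name the discretization or the grammar-induction procedure, so the sketch substitutes stand-ins that are assumptions: a crude SAX-like binning of window means into three letters, and a Re-Pair-style repeated digram replacement. The function names `discretize`, `rule_coverage`, and `least_covered` are hypothetical and chosen for illustration only.

```python
import math
from collections import Counter


def discretize(series, window, alphabet="abc"):
    """Assumed SAX-like step: z-normalize, then map each non-overlapping
    window mean to one of three letters."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / len(series)) or 1.0
    z = [(x - mean) / std for x in series]
    cuts = [-0.43, 0.43]  # approximate tertile breakpoints of N(0, 1)
    word = []
    for i in range(0, len(z) - window + 1, window):
        m = sum(z[i:i + window]) / window
        word.append(alphabet[sum(m > c for c in cuts)])
    return word


def rule_coverage(word):
    """Assumed Re-Pair-style induction (a stand-in, not the authors' method).
    Returns, for each position of `word`, how many rule applications cover it."""
    tokens = [(s, i, i + 1) for i, s in enumerate(word)]  # (symbol, start, end)
    coverage = [0] * len(word)
    next_id = 0
    while True:
        pairs = Counter((a[0], b[0]) for a, b in zip(tokens, tokens[1:]))
        for pair, _ in pairs.most_common():
            merged, hits, i = [], [], 0
            # Greedy left-to-right, non-overlapping replacement of `pair`.
            while i < len(tokens):
                if i + 1 < len(tokens) and (tokens[i][0], tokens[i + 1][0]) == pair:
                    span = (tokens[i][1], tokens[i + 1][2])
                    hits.append(span)
                    merged.append((f"R{next_id}",) + span)
                    i += 2
                else:
                    merged.append(tokens[i])
                    i += 1
            if len(hits) >= 2:  # a rule must repeat to compress anything
                break
        else:
            return coverage  # no digram repeats: nothing left to compress
        for start, end in hits:
            for p in range(start, end):
                coverage[p] += 1
        tokens = merged
        next_id += 1


def least_covered(coverage):
    """Positions with minimal rule coverage: the anomaly candidates."""
    low = min(coverage)
    return [i for i, c in enumerate(coverage) if c == low]


word = list("abcabcabcxyzabcabc")
print(least_covered(rule_coverage(word)))  # [9, 10, 11]
```

In this toy word, the repeated `abc` pattern is absorbed into nested rules, while `xyz` is never covered by any rule, so its positions surface as the least compressible region. No anomaly length is supplied at any point; the extent of the flagged region falls out of the grammar.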
