Finding Time Series Discords Based on Haar Transform

The problem of finding anomalies has received much attention recently. However, most anomaly detection algorithms depend on an explicit definition of an anomaly, which may be impossible to elicit from a domain expert. Using discords as anomaly detectors is useful because fewer parameters need to be set. Keogh et al. proposed an efficient method for finding discords. However, their algorithm requires users to choose the word size used to compress subsequences. In this paper, we propose an algorithm that determines the word size for compression dynamically. Our method is based on properties of the Haar wavelet transform. Our experiments show that this method is highly effective.
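To make the two building blocks named above concrete, here is a minimal Python sketch. It assumes the usual definition of a discord (the subsequence whose distance to its nearest non-overlapping match is largest) and the orthonormal form of the Haar transform. The function names `haar_transform` and `brute_force_discord` are hypothetical, and the quadratic search is only a baseline for illustration; it is not the algorithm proposed in this paper and does not perform the dynamic word-size selection.

```python
import math


def haar_transform(values):
    """Orthonormal Haar transform of a sequence whose length is a power of two.

    Returns [overall scaled average, coarsest detail, ..., finest details].
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in values]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / math.sqrt(2) for i in pairs]
        diffs = [(current[i] - current[i + 1]) / math.sqrt(2) for i in pairs]
        details = diffs + details  # coarser details end up in front
        current = averages
    return current + details


def brute_force_discord(series, m):
    """Return (position, distance) of the length-m discord by exhaustive search.

    Assumed definition: the discord is the subsequence whose Euclidean distance
    to its nearest non-overlapping subsequence is the largest.
    """
    count = len(series) - m + 1
    best_pos, best_dist = None, -1.0
    for i in range(count):
        nearest = math.inf
        for j in range(count):
            if abs(i - j) >= m:  # skip overlapping (trivial) matches
                d = math.dist(series[i:i + m], series[j:j + m])
                nearest = min(nearest, d)
        if nearest != math.inf and nearest > best_dist:
            best_pos, best_dist = i, nearest
    return best_pos, best_dist


if __name__ == "__main__":
    data = [0, 1, 0, 1, 0, 1, 0, 1, 5, 1, 0, 1, 0, 1, 0, 1]
    print(haar_transform(data[:4]))
    print(brute_force_discord(data, 2))
```

Because the orthonormal transform preserves Euclidean distance, keeping only the leading coefficients gives a lower-resolution view of each subsequence. The number of coefficients kept plays a role loosely comparable to the word size mentioned above.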
