An evolutionary approach to pattern-based time series segmentation

Time series data, being numerical and continuous, are difficult to process, analyze, and mine. These tasks become easier when the data can be transformed into meaningful symbols. Most recent work on time series addresses only how to identify a given pattern in a time series; it does not consider how to identify a suitable set of time points for segmenting the series in accordance with a given set of pattern templates (e.g., a set of technical patterns for stock analysis). Fixed-length segmentation oversimplifies this problem; a dynamic approach with high controllability is preferable, so that the time series can be segmented flexibly and effectively according to the needs of users and applications. Because this segmentation problem is an optimization problem, and evolutionary computation is an appropriate tool for solving it, we propose an evolutionary time series segmentation algorithm. This approach allows a sizeable set of pattern templates to be generated for mining or querying. In addition, defining similarity between time series (or time series segments) is of fundamental importance in fitness computation. By identifying the perceptually important points directly in the time domain, time series segments and templates of different lengths can be compared, and intuitive pattern matching can be carried out effectively and efficiently. Encouraging experimental results are reported from tests that segment both artificial time series generated from combinations of pattern templates and the time series of selected Hong Kong stocks.
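
To make the idea of perceptually important points (PIPs) concrete, the following is a minimal sketch of one common formulation, not necessarily the exact procedure used in this work. It assumes that the first and last points are always PIPs and that each further PIP is the point with the largest vertical distance to the line joining its two neighboring PIPs; the function name `find_pips` and its parameters are hypothetical.

```python
def find_pips(series, n_pips):
    """Return the sorted indices of up to n_pips perceptually important points.

    Assumption: importance is measured by vertical distance to the line
    joining the two adjacent PIPs already selected.
    """
    n = len(series)
    if n < 2 or n_pips < 2:
        raise ValueError("need at least two data points and two PIPs")

    pips = [0, n - 1]  # the two end points are always kept
    while len(pips) < min(n_pips, n):
        best_idx, best_dist = None, -1.0
        # Scan every gap between adjacent PIPs for the most distant point.
        for left, right in zip(pips, pips[1:]):
            slope = (series[right] - series[left]) / (right - left)
            for i in range(left + 1, right):
                line_y = series[left] + slope * (i - left)
                dist = abs(series[i] - line_y)
                if dist > best_dist:
                    best_idx, best_dist = i, dist
        if best_idx is None:  # no interior points remain
            break
        pips.append(best_idx)
        pips.sort()
    return pips


if __name__ == "__main__":
    data = [0, 1, 5, 1, 0, 1, 0]
    print(find_pips(data, 3))  # [0, 2, 6]
```

Reducing a segment and a template to the same number of PIPs in this way is what lets sequences of different lengths be compared point by point.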
