Effect of segmentation on financial time series pattern matching

Highlights

- We evaluate financial time series pattern matching and segmentation methods.
- PIP, PAA, PLA, and TP segmentation methods are analyzed.
- TB, RB, HY, DT, and SAX pattern matching approaches are evaluated.
- PIP achieves better performance and is especially superior when used with RB and HY.

In financial time series pattern matching, segmentation is often performed as a pre-processing step to reduce the number of data points in the input sequence. The segmentation process extracts the important data points and produces a time series with fewer points. In this paper, we evaluate the effectiveness and accuracy of five approaches to financial time series pattern matching when used with four segmentation methods: perceptually important points (PIP), piecewise aggregate approximation (PAA), piecewise linear approximation (PLA), and turning points (TP). The pattern matching approaches analysed are the template-based (TB), rule-based (RB), hybrid (HY), decision tree (DT), and Symbolic Aggregate approXimation (SAX) approaches. The analysis is performed twice: on a real data set of Hang Seng Index prices from the Hong Kong stock market, and on a synthetic data set containing positive and negative cases of a technical pattern known as head-and-shoulders.
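To make the idea of segmentation as data reduction concrete, the following Python sketch implements piecewise aggregate approximation in its common textbook form, where the series is cut into frames of near-equal width and each frame is replaced by its mean. It is only an illustration under stated assumptions: the function name `paa`, its parameters, and the sample values are hypothetical and are not taken from the paper, whose own implementation may differ.

```python
def paa(series, n_segments):
    """Reduce `series` to `n_segments` values by averaging near-equal frames.

    Illustrative sketch only; names and frame-boundary handling are assumptions.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    means = []
    for i in range(n_segments):
        # Integer frame boundaries; frame sizes differ by at most one point
        # when n is not divisible by n_segments.
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        frame = series[start:end]
        means.append(sum(frame) / len(frame))
    return means


# Six made-up values reduced to three frame means.
reduced = paa([1, 2, 3, 4, 5, 6], 3)
print(reduced)
```

Averaging smooths out local extremes, whereas methods such as PIP and TP keep selected original points; the sketch shows only the reduction step and not any of the pattern matching approaches.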
