A real time hybrid pattern matching scheme for stock time series

Pattern matching in stock time series is an active research area in data mining. This paper proposes a new real-time hybrid pattern-matching algorithm based on Spearman's rank correlation, rule sets, and a sliding window. The sliding window restricts matching to the most recently received subsequence of stock data, which reduces processing time and makes the algorithm suitable for real-time applications. Spearman's rank correlation coefficient is first used to classify candidate subsequences against the preferred patterns; rule sets then describe the query patterns in more detail, making the scheme more sensitive and more tightly constrained when distinguishing individual patterns. Experimental results are encouraging: the proposed scheme outperforms the other methods tested in both effectiveness and efficiency, especially in differentiating special preferred stock patterns and even distorted patterns.
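The two-stage idea in the abstract (a rank-correlation filter over a sliding window, followed by a rule check) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the 0.9 threshold, the head-and-shoulders template, and the single rule shown are all assumptions chosen for illustration, since the abstract does not specify the rule sets or parameters.

```python
from collections import deque
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat window carries no rank information
    return cov / sqrt(var_x * var_y)


def match_stream(prices, template, threshold=0.9, rule=None):
    """Slide a window of len(template) over incoming prices.

    Stage 1 keeps windows whose Spearman correlation with the template
    reaches `threshold` (an assumed value). Stage 2 applies an optional
    `rule` predicate to the surviving window. Returns (start_index, rho).
    """
    window = deque(maxlen=len(template))
    hits = []
    for t, price in enumerate(prices):
        window.append(price)  # the oldest price drops out automatically
        if len(window) < len(template):
            continue
        current = list(window)
        rho = spearman(current, template)
        if rho >= threshold and (rule is None or rule(current)):
            hits.append((t - len(template) + 1, rho))
    return hits


# Hypothetical head-and-shoulders template and one illustrative rule:
# the head (middle point) must exceed both shoulders.
template = [1, 3, 2, 5, 2, 3, 1]
head_above_shoulders = lambda w: w[3] > max(w[1], w[5])

prices = [10, 10.5, 11, 12, 11.5, 14, 11.6, 12.1, 10.9, 10, 9]
hits = match_stream(prices, template, rule=head_above_shoulders)
```

Because only the newest window is examined at each step, the work per incoming price depends on the template length rather than on the length of the full history, which is the property the abstract relies on for real-time use. The rule stage is what lets a query reject windows that correlate well in rank but violate a shape constraint.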
