Fast time-series searching with scaling and shifting

Searching for similar patterns in time-series data is important in a wide range of scientific and business applications. In this paper, we first propose a definition of similarity based on scaling and shifting transformations: sequence A is defined to be similar to sequence B if suitable scaling and shifting transformations can be found that transform A to B. We then present a geometric view of the problem from which the scaling factor and the shifting offset can be determined, and which allows sequence searching to be performed with a tree-based indexing structure. Finally, we discuss some technical aspects and report experiments on real data (stock price movements) that measure the performance of our algorithm.
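The similarity definition above can be illustrated with a minimal sketch. It assumes that "similar" means the best scaled and shifted copy of A lies within a Euclidean tolerance of B, and it finds the scaling factor and shifting offset by ordinary least squares. The function names `fit_scale_shift` and `is_similar` and the parameter `epsilon` are hypothetical, and this is not necessarily the geometric construction or the indexing method used in the paper.

```python
from math import sqrt


def fit_scale_shift(a, b):
    """Return (scale, shift, distance) minimizing ||scale * a + shift - b||.

    Least-squares sketch only; the names and the use of Euclidean
    distance are assumptions made for illustration.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Subtracting the means removes the shifting offset from the problem.
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sum(x * x for x in da)
    # A constant sequence cannot be usefully scaled, so only shift it.
    scale = sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0
    shift = mean_b - scale * mean_a
    # Residual distance between the transformed a and b.
    dist = sqrt(sum((scale * x + shift - y) ** 2 for x, y in zip(a, b)))
    return scale, shift, dist


def is_similar(a, b, epsilon):
    """True if some scaling and shifting of a lies within epsilon of b."""
    return fit_scale_shift(a, b)[2] <= epsilon
```

For example, `fit_scale_shift([1, 2, 3, 4], [5, 7, 9, 11])` recovers a scaling factor of 2 and a shifting offset of 3 with zero residual distance, so the two sequences are similar for any non-negative `epsilon`.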