Shape-Based Subsequence Retrieval Supporting Multiple Models in Time-Series Databases

Shape-based retrieval is the operation that searches for the (sub)sequences whose shapes are similar to that of a query sequence, regardless of their actual element values. In this paper, we propose a similarity model suitable for shape-based retrieval and present an indexing method that supports it. The proposed similarity model retrieves similar shapes accurately by combining various shape-preserving transformations such as normalization, moving average, and time warping. Our indexing method stores every distinct subsequence concisely in a disk-based suffix tree for efficient and adaptive query processing. We allow the user to dynamically choose a similarity model suitable for a given application. More specifically, the user determines the parameter p of the distance function when submitting a query. Extensive experiments show that our approach not only successfully finds the subsequences whose shapes are similar to a query shape but also significantly outperforms sequential search.
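To make the combination of transformations concrete, the following is a minimal sketch rather than the paper's actual definitions. It assumes min-max normalization, a simple k-point moving average, and a time warping distance whose element costs are accumulated in the style of an L_p norm, with p supplied at query time. The function names, the choice of normalization, and the recurrence are illustrative assumptions, and the suffix-tree index is not modeled here.

```python
import math

def normalize(seq):
    """Min-max normalization to [0, 1] (an assumed form of normalization)."""
    lo, hi = min(seq), max(seq)
    if hi == lo:
        return [0.0 for _ in seq]
    return [(x - lo) / (hi - lo) for x in seq]

def moving_average(seq, k):
    """Simple k-point moving average; the result has len(seq) - k + 1 elements."""
    if k <= 0 or k > len(seq):
        raise ValueError("k must be between 1 and len(seq)")
    return [sum(seq[i:i + k]) / k for i in range(len(seq) - k + 1)]

def time_warping_distance(a, b, p):
    """Dynamic time warping with L_p-style accumulation (hypothetical recurrence).

    For finite p, element costs |a_i - b_j|**p are summed along the best
    warping path and the p-th root is taken at the end. For p = math.inf,
    the largest element cost along the best path is returned.
    """
    n, m = len(a), len(b)
    d = [[math.inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            best = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            if p == math.inf:
                d[i][j] = max(cost, best)
            else:
                d[i][j] = cost ** p + best
    total = d[n][m]
    return total if p == math.inf else total ** (1.0 / p)

def shape_distance(s, q, k, p):
    """Normalize, smooth, then compare under time warping with parameter p."""
    s_t = moving_average(normalize(s), k)
    q_t = moving_average(normalize(q), k)
    return time_warping_distance(s_t, q_t, p)
```

Under these assumptions, two sequences with the same shape at different value levels, such as [10, 20, 30, 20, 10] and [110, 120, 130, 120, 110], have distance zero, and changing p changes only how element-wise differences along the warping path are aggregated.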
