Mining for weak periodic signals in time series databases

Periodicity is an inherent feature of many real-world time series data sets. In this article we propose a data mining technique for detecting multiple partial and approximate periodicities. Our approach is exploratory and follows a filter/refine paradigm. In the filter phase we introduce an autocorrelation-based algorithm that produces a set of candidate partial periodicities; the algorithm is then extended to capture approximate periodicities. In the refine phase we prune the candidates that turn out to be invalid. We conducted a series of experiments on various real-world data sets to measure performance and verify the quality of the results.
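To make the filter idea concrete, here is a minimal sketch of autocorrelation-based candidate generation over a symbolic sequence. It is an illustration under stated assumptions, not the algorithm from the article: the function name `candidate_periods`, the `min_conf` threshold, and the normalization by `n // lag - 1` are all hypothetical choices. The sketch handles exact matches only, with no approximate periodicities and no refine phase.

```python
def candidate_periods(series, min_conf=0.8, max_period=None):
    """Return {period: [symbols]} for symbols whose autocorrelation at that
    lag is high enough to make the period a candidate (hypothetical filter)."""
    n = len(series)
    if max_period is None:
        max_period = n // 2  # beyond n/2 a period cannot repeat even twice
    candidates = {}
    for symbol in sorted(set(series)):
        # Binary indicator vector: 1 where the symbol occurs, else 0.
        indicator = [1 if s == symbol else 0 for s in series]
        for lag in range(1, max_period + 1):
            # Autocorrelation at this lag: number of positions i where the
            # symbol occurs at both i and i + lag.
            matches = sum(indicator[i] * indicator[i + lag]
                          for i in range(n - lag))
            # Assumed normalization: a symbol repeating every `lag` steps at
            # a single offset yields about n // lag - 1 such pairs.
            expected = n // lag - 1
            if expected >= 1 and matches >= min_conf * expected:
                candidates.setdefault(lag, []).append(symbol)
    return candidates


if __name__ == "__main__":
    # 'a' and 'b' recur every 3 steps; the third slot holds arbitrary symbols,
    # so the periodicity is partial.
    print(candidate_periods(list("abxabyabzabw")))
```

In this sketch, multiples of a true period (6 in the example) also pass the filter, and a symbol that recurs at several offsets can exceed the assumed bound. That is the kind of over-generation a subsequent refine step would have to prune.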
