Mining closed flexible patterns in time-series databases

In this paper, we propose an efficient algorithm, called CFP, for mining closed flexible patterns in time-series databases, where flexible gaps are allowed in a pattern. Our proposed algorithm involves three stages: transforming a time-series database into a symbolic database, generating all frequent patterns of length one from the transformed database, and mining closed flexible patterns in a depth-first search manner. In the proposed method, we design two pruning strategies and a closure checking scheme to reduce the search space and thus speed up the algorithm. The experimental results show that our algorithm outperforms the modified Apriori algorithm by an order of magnitude.

[1]  Nicolas Pasquier,et al.  Discovering Frequent Closed Itemsets for Association Rules , 1999, ICDT.

[2]  Jiawei Han,et al.  BIDE: efficient mining of frequent closed sequences , 2004, Proceedings. 20th International Conference on Data Engineering.

[3]  Johannes Gehrke,et al.  Sequential PAttern mining using a bitmap representation , 2002, KDD.

[4]  Anthony J. T. Lee,et al.  Mining inter-sequence patterns , 2009, Expert Syst. Appl..

[5]  Ramakrishnan Srikant,et al.  Mining sequential patterns , 1995, Proceedings of the Eleventh International Conference on Data Engineering.

[6]  Jian Pei,et al.  CLOSET+: searching for the best strategies for mining frequent closed itemsets , 2003, KDD '03.

[7]  Clu-istos Foutsos,et al.  Fast subsequence matching in time-series databases , 1994, SIGMOD '94.

[8]  Kyuseok Shim,et al.  SQUIRE: Sequential pattern mining with quantities , 2007, J. Syst. Softw..

[9]  Xifeng Yan,et al.  CloSpan: Mining Closed Sequential Patterns in Large Datasets , 2003, SDM.

[10]  R. Larsen,et al.  An introduction to mathematical statistics and its applications (2nd edition) , by R. J. Larsen and M. L. Marx. Pp 630. £17·95. 1987. ISBN 13-487166-9 (Prentice-Hall) , 1987, The Mathematical Gazette.

[11]  Vincent S. Tseng,et al.  Efficient mining of temporal emerging itemsets from data streams , 2009, Expert Syst. Appl..

[12]  Umeshwar Dayal,et al.  PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth , 2001, ICDE 2001.

[13]  Eamonn J. Keogh,et al.  On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration , 2002, Data Mining and Knowledge Discovery.

[14]  Qiming Chen,et al.  PrefixSpan,: mining sequential patterns efficiently by prefix-projected pattern growth , 2001, Proceedings 17th International Conference on Data Engineering.

[15]  Mohammed J. Zaki,et al.  Efficient algorithms for mining closed itemsets and their lattice structure , 2005, IEEE Transactions on Knowledge and Data Engineering.

[16]  Umeshwar Dayal,et al.  FreeSpan: frequent pattern-projected sequential pattern mining , 2000, KDD '00.

[17]  Eamonn J. Keogh,et al.  A symbolic representation of time series, with implications for streaming algorithms , 2003, DMKD '03.

[18]  Jian Pei,et al.  CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets , 2000, ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.

[19]  M. Teisseire,et al.  Efficient mining of sequential patterns with time constraints: Reducing the combinations , 2009, Expert Syst. Appl..

[20]  Suh-Yin Lee,et al.  Incremental updates of closed frequent itemsets over continuous data streams , 2009, Expert Syst. Appl..

[21]  Rakesh Agarwal,et al.  Fast Algorithms for Mining Association Rules , 1994, VLDB 1994.

[22]  Mohammed J. Zaki,et al.  SPADE: An Efficient Algorithm for Mining Frequent Sequences , 2004, Machine Learning.

[23]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[24]  Ramakrishnan Srikant,et al.  Mining Sequential Patterns: Generalizations and Performance Improvements , 1996, EDBT.

[25]  Ramakrishnan Srikant,et al.  Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[26]  Eamonn J. Keogh Fast similarity search in the presence of longitudinal scaling in time series databases , 1997, Proceedings Ninth IEEE International Conference on Tools with Artificial Intelligence.

[27]  Kyuseok Shim,et al.  Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases , 1995, VLDB.

[28]  Ramakrishnan Srikant,et al.  Fast algorithms for mining association rules , 1998, VLDB 1998.