Mining Frequent Patterns in the Recent Time Window over Data Streams

Because stream data is fluid and continuous, historical transactions may become obsolete and useless as new transactions arrive. It is therefore more desirable to mine the frequent patterns in the recent time window of a data stream. This paper proposes a method for mining the recent frequent patterns in the sliding window of a data stream. It uses a conservative method to calculate the approximate frequencies of patterns in the sliding window, and it uses a recent frequent pattern tree (RFP-tree for short) to incrementally capture the information of the transactions in the sliding window by scanning them only once. Moreover, based on the properties of the RFP-tree, a series of algorithms is designed to efficiently maintain and mine the frequent patterns from data streams. Finally, experimental results show that the proposed method is more efficient and scalable than other existing algorithms.
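To make the sliding-window setting concrete, the following is a minimal sketch of one-pass, incremental itemset counting over the most recent transactions. It is not the RFP-tree and not the paper's conservative approximation, neither of which is specified in the abstract: it keeps exact counts in a flat dictionary. The class name `SlidingWindowCounter` and the parameters `window_size`, `max_len`, and `min_support` are hypothetical names introduced only for this illustration.

```python
from collections import Counter, deque
from itertools import combinations


class SlidingWindowCounter:
    """Exact itemset counts over the most recent `window_size` transactions.

    Illustrative assumption: a count-based window and a flat Counter,
    standing in for the tree structure described in the abstract.
    """

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of transactions kept
        self.max_len = max_len          # largest itemset size to count
        self.window = deque()           # transactions currently in the window
        self.counts = Counter()         # itemset (sorted tuple) -> count

    def _itemsets(self, transaction):
        # Enumerate every itemset of size 1..max_len in the transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            yield from combinations(items, k)

    def add(self, transaction):
        # Each arriving transaction is scanned once to update the counts.
        self.window.append(transaction)
        for itemset in self._itemsets(transaction):
            self.counts[itemset] += 1
        # When the window overflows, retract the oldest transaction.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def frequent(self, min_support):
        # min_support is a fraction of the transactions now in the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}


counter = SlidingWindowCounter(window_size=3)
for transaction in [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]:
    counter.add(transaction)
print(counter.frequent(min_support=0.6))
```

Enumerating every itemset of a transaction grows exponentially with `max_len`, which is one reason a compact tree structure and approximate frequencies are attractive for real streams.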
