Efficient Mining of Frequent Itemsets from Data Streams

As technology advances, many applications, such as wireless sensor networks and Web click streams, produce and share floods of data. This calls for efficient techniques for extracting useful information and knowledge from data streams. In this paper, we propose a novel algorithm for mining frequent itemsets from streams in a limited-memory environment. The algorithm uses a compact tree structure to capture the important contents of the streams. By exploiting its properties, the tree can be easily maintained and can be used for mining frequent itemsets, as well as other patterns such as constrained itemsets, even when the available memory is small.
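To make the idea of a compact tree over a stream concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a sliding window over fixed batches of transactions, a prefix tree whose nodes keep one count per batch in the window, and items stored in sorted order. The names `StreamTree`, `add_batch`, `window`, and `frequent_itemsets` are hypothetical, and the mining step is a brute-force subset count chosen for brevity rather than efficiency.

```python
from collections import Counter, deque
from itertools import combinations


class Node:
    def __init__(self, window):
        self.children = {}
        # One count per batch in the window; the oldest batch is on the left.
        self.counts = deque([0] * window, maxlen=window)


class StreamTree:
    """Hypothetical prefix tree holding the last `window` batches of a stream."""

    def __init__(self, window):
        self.window = window
        self.root = Node(window)

    def add_batch(self, transactions):
        # Drop the oldest batch and open an empty slot for the new one.
        self._slide(self.root)
        for transaction in transactions:
            node = self.root
            # Assumption: items are inserted in sorted order so that
            # transactions sharing a prefix share a path.
            for item in sorted(set(transaction)):
                if item not in node.children:
                    node.children[item] = Node(self.window)
                node = node.children[item]
                node.counts[-1] += 1

    def _slide(self, node):
        node.counts.append(0)  # maxlen discards the oldest batch count
        for item in list(node.children):
            child = node.children[item]
            self._slide(child)
            if sum(child.counts) == 0:
                del node.children[item]  # prune paths no longer in the window

    def _weighted_paths(self):
        # Recover (itemset, multiplicity) pairs for transactions in the window.
        paths = []

        def walk(node, prefix):
            for item, child in node.children.items():
                path = prefix + (item,)
                passing = sum(child.counts)
                continuing = sum(sum(g.counts) for g in child.children.values())
                if passing > continuing:
                    paths.append((path, passing - continuing))
                walk(child, path)

        walk(self.root, ())
        return paths

    def frequent_itemsets(self, minsup):
        # Brute force: count every subset of every stored transaction.
        support = Counter()
        for path, weight in self._weighted_paths():
            for k in range(1, len(path) + 1):
                for subset in combinations(path, k):
                    support[subset] += weight
        return {s: c for s, c in support.items() if c >= minsup}


tree = StreamTree(window=2)
tree.add_batch([["a", "b"], ["a", "c"]])
tree.add_batch([["a", "b"], ["b", "c"]])
tree.add_batch([["a", "b", "c"]])  # the first batch now falls out of the window
result = tree.frequent_itemsets(minsup=2)
```

Because each node stores its counts per batch, discarding the oldest batch only requires shifting those counts and pruning nodes that drop to zero; the tree never has to be rebuilt from the raw stream. A real implementation would replace the brute-force counting with a pattern-growth procedure.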
