The maintenance of representative frequent itemsets

Frequent itemset mining discovers groups of items that appear together in a transaction database at least as often as a user-specified threshold. A transaction database may contain so many frequent itemsets, however, that a decision maker has difficulty using them. Mining frequent closed itemsets has therefore become a major research topic, since all frequent itemsets can be derived from the frequent closed itemsets. In addition, transactions are constantly added to and removed from a database, and updating previously mined frequent closed itemsets to reflect these changes is a challenge. In our previous research, we proposed an algorithm, MRFI, that maintains the frequent closed itemsets when transactions are added to a transaction database. In this paper, we propose an efficient algorithm that maintains the frequent closed itemsets when transactions are deleted from a transaction database, without rescanning the original database. Our algorithm updates the closed itemsets by applying a set of rules, without spending much time searching the previous closed itemsets. Experimental results show that our algorithm significantly outperforms previous approaches, which spend much time searching the previous closed itemsets.
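
To make the terminology concrete, the following minimal sketch illustrates what a frequent closed itemset is (a frequent itemset with no proper superset of equal support) and how the support of any frequent itemset can be recovered from the closed ones. It is a brute-force illustration only, not the algorithm proposed in this paper: it recomputes everything from the full database after a deletion, which is exactly the cost an incremental method aims to avoid. The function names, the toy transactions, and the absolute threshold `min_count` are assumptions made for the example.

```python
from itertools import combinations


def support_counts(transactions):
    """Count every itemset contained in at least one transaction (brute force)."""
    counts = {}
    for t in transactions:
        items = sorted(t)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return counts


def frequent_closed_itemsets(transactions, min_count):
    """Return {itemset: support} for the frequent closed itemsets.

    An itemset is closed if no proper superset has the same support.
    """
    counts = support_counts(transactions)
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    closed = {}
    for s, c in frequent.items():
        # s < other is the proper-subset test on frozensets.
        if not any(s < other and c == oc for other, oc in frequent.items()):
            closed[s] = c
    return closed


def derived_support(itemset, closed):
    """Recover an itemset's support from the closed itemsets alone.

    The support equals the largest support among closed supersets;
    None means the itemset is not frequent.
    """
    supports = [c for s, c in closed.items() if itemset <= s]
    return max(supports) if supports else None


# Hypothetical toy database with an absolute support threshold of 2.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "b"}, {"c"}]
closed_before = frequent_closed_itemsets(transactions, min_count=2)

# Naive maintenance after deleting the last transaction: recompute from scratch.
closed_after = frequent_closed_itemsets(transactions[:-1], min_count=2)
```

In this toy database, {a} and {b} are frequent but not closed, because {a, b} has the same support; their supports are still recoverable through `derived_support`. Deleting the transaction {c} makes {c} infrequent, so the set of closed itemsets changes even though only one transaction was removed.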
