A Fast Algorithm for Maintenance of Association Rules in Incremental Databases

In this paper, we propose an algorithm for maintaining the frequent itemsets discovered in a database with minimal re-computation when new transactions are added to, or old transactions are removed from, the transaction database. The algorithm, called EFPIM (Extending FP-tree for Incremental Mining), is built on the EFP-tree (extended FP-tree) structure. An important feature of EFPIM is that it requires no scan of the original database: the EFP-tree of the updated database is obtained directly from the EFP-tree of the original database. We give two versions of the algorithm, EFPIM1 (a version that is easy to implement) and EFPIM2 (a faster version), both of which mine the frequent itemsets of the updated database from the EFP-tree. Experimental results show that EFPIM outperforms existing algorithms in terms of execution time.
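To illustrate the general idea of updating a tree summary in place rather than rescanning the original database, the following minimal sketch keeps transactions in a plain prefix tree with a fixed lexicographic item order. It is not the EFP-tree or either EFPIM variant, whose details are not given here; the names `Node`, `PrefixTree`, `add`, `remove`, `support`, and `frequent_itemsets` are hypothetical, and mining is done by simple depth-first enumeration rather than pattern growth.

```python
class Node:
    """One prefix-tree node (hypothetical helper, not the EFP-tree node)."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0      # transactions whose sorted item list passes through this node
        self.ends = 0       # transactions that end exactly at this node
        self.children = {}  # item -> Node


class PrefixTree:
    """Transaction summary that is updated in place, with no database rescan."""

    def __init__(self):
        self.root = Node()
        self.size = 0  # number of transactions currently represented

    def add(self, transaction):
        node = self.root
        for item in sorted(set(transaction)):
            if item not in node.children:
                node.children[item] = Node(item)
            node = node.children[item]
            node.count += 1
        node.ends += 1
        self.size += 1

    def remove(self, transaction):
        # First locate the full path so a failed removal changes nothing.
        path = [self.root]
        for item in sorted(set(transaction)):
            child = path[-1].children.get(item)
            if child is None:
                raise KeyError("transaction not present")
            path.append(child)
        if path[-1].ends == 0:
            raise KeyError("transaction not present")
        path[-1].ends -= 1
        self.size -= 1
        for parent, child in zip(path, path[1:]):
            child.count -= 1
            if child.count == 0:
                del parent.children[child.item]  # prune the empty branch

    def support(self, itemset):
        """Number of stored transactions that contain every item of itemset."""
        target = sorted(set(itemset))
        if not target:
            return self.size

        def walk(node, i):
            if i == len(target):
                return node.count
            total = 0
            for item, child in node.children.items():
                if item == target[i]:
                    total += walk(child, i + 1)
                elif item < target[i]:
                    total += walk(child, i)
                # item > target[i]: target[i] cannot occur deeper on this path
            return total

        return walk(self.root, 0)

    def frequent_itemsets(self, min_count):
        """All itemsets with support >= min_count, found by depth-first extension."""
        items, stack = set(), [self.root]
        while stack:
            node = stack.pop()
            for child in node.children.values():
                items.add(child.item)
                stack.append(child)
        result = {}

        def grow(prefix, candidates):
            for k, item in enumerate(candidates):
                itemset = prefix + (item,)
                count = self.support(itemset)
                if count >= min_count:  # supersets of infrequent sets are skipped
                    result[itemset] = count
                    grow(itemset, candidates[k + 1:])

        grow((), sorted(items))
        return result
```

In this sketch an insertion or deletion touches only the nodes on one root-to-leaf path, so the cost of an update depends on the transaction length and not on the database size. A fixed item order is assumed precisely so that updates never force a reordering of the tree; a frequency-ordered FP-tree would need extra handling when item frequencies change.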
