论文信息 - An improved frequent pattern growth method for mining association rules

An improved frequent pattern growth method for mining association rules

Many algorithms have been proposed to efficiently mine association rules. One of the most important approaches is FP-growth. Without candidate generation, FP-growth proposes an algorithm to compress information needed for mining frequent itemsets in FP-tree and recursively constructs FP-trees to find all frequent itemsets. Performance results have demonstrated that the FP-growth method performs extremely well. In this paper, we propose the IFP-growth (improved FP-growth) algorithm to improve the performance of FP-growth. There are three major features of IFP-growth. First, it employs an address-table structure to lower the complexity of forming the entire FP-tree. Second, it uses a new structure called FP-tree+ to reduce the need for building conditional FP-trees recursively. Third, by using address-table and FP-tree+ the proposed algorithm has less memory requirement and better performance in comparison with FP-tree based algorithms. The experimental results show that the IFP-growth requires relatively little memory space during the mining process. Even when the minimum support is low, the space needed by IFP-growth is about one half of that of FP-growth and about one fourth of that of nonordfp algorithm. As to the execution time, our method outperforms FP-growth by one to 300 times under different minimum supports. The proposed algorithm also outperforms nonordfp algorithm in most cases. As a result, IFP-growth is very suitable for high performance applications.

[1] Jian Pei,et al. Mining frequent patterns without candidate generation , 2000, SIGMOD 2000.

[2] Philip S. Yu,et al. Using a Hash-Based Method with Transaction Trimming for Mining Association Rules , 1997, IEEE Trans. Knowl. Data Eng..

[3] Jian Pei,et al. Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach , 2006, Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06).

[4] Srinivasan Parthasarathy,et al. New Algorithms for Fast Discovery of Association Rules , 1997, KDD.

[5] Tomasz Imielinski,et al. Mining association rules between sets of items in large databases , 1993, SIGMOD Conference.

[6] Gösta Grahne,et al. Fast algorithms for frequent itemset mining using FP-trees , 2005, IEEE Transactions on Knowledge and Data Engineering.

[7] Suh-Yin Lee,et al. Mining frequent itemsets over data streams using efficient window sliding techniques , 2009, Expert Syst. Appl..

[8] Balázs Rácz,et al. nonordfp: An FP-growth variation without rebuilding the FP-tree , 2004, FIMI.

[9] Ramakrishnan Srikant,et al. Fast Algorithms for Mining Association Rules in Large Databases , 1994, VLDB.

[10] Fabrizio Silvestri,et al. kDCI: a Multi-Strategy Algorithm for Mining Frequent Sets , 2003, FIMI.