Paper information - A Space Optimization for FP-Growth

A Space Optimization for FP-Growth

The frequency mining problem lies at the core of several data mining algorithms. Among frequent pattern discovery algorithms, FP-GROWTH employs a unique search strategy using compact structures, resulting in a high-performance algorithm that requires only two database passes. We introduce an enhanced version of this algorithm, called FP-GROWTH-TINY, which can mine larger databases thanks to a space optimization that eliminates the need for intermediate conditional pattern bases. We present in detail the algorithms required for directly constructing a conditional FP-Tree. Experiments demonstrate that our implementation has a running time comparable to the original algorithm while reducing memory use by up to a factor of two.
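For orientation, the two database passes mentioned in the abstract can be sketched as below. This is a minimal illustration of standard FP-tree construction, not the paper's FP-GROWTH-TINY implementation. The names `FPNode` and `build_fp_tree`, the absolute `min_support` count, and the alphabetical tie-breaking rule are assumptions made for the example, and the header table and node links used during mining are left out.

```python
from collections import Counter


class FPNode:
    """One node of an FP-tree: an item, its count, and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree in two passes over `transactions` (hypothetical helper)."""
    # Pass 1: count how many transactions contain each item.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    # Pass 2: insert each transaction with infrequent items dropped and the
    # rest sorted by descending support (ties broken by item name, an
    # assumption made here so that the ordering is deterministic).
    root = FPNode(None)
    for t in transactions:
        path = sorted((i for i in set(t) if i in frequent),
                      key=lambda i: (-frequent[i], i))
        node = root
        for item in path:
            # Reuse the child for this item if it exists, else create it.
            if item not in node.children:
                node.children[item] = FPNode(item, node)
            node = node.children[item]
            node.count += 1
    return root, frequent
```

Transactions that share a prefix of frequent items share a path in the tree, which is where the compactness comes from. For example, `build_fp_tree([["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]], 2)` yields a root with a single child `b` whose count is 4.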

Cevdet Aykanat | Eray Özkural
