Paper Information - Parallel Algorithm for Mining Maximal Frequent Patterns


We present Par-MinMax, a novel parallel algorithm for mining maximal frequent patterns. It decomposes the search space into prefix-based equivalence classes, distributes the work among the processors, and selectively duplicates the databases so that each processor can compute the maximal frequent patterns independently. It uses a multi-level backtracking pruning strategy and other novel pruning strategies, together with a vertical database format in which frequency is counted by a simple tid-list intersection. These techniques eliminate the need for synchronization and drastically reduce I/O overhead. Analysis and experimental results demonstrate the efficiency of our approach in comparison with existing work.
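To make the terms in the abstract concrete, here is a minimal Python sketch of three of the ideas it names: a vertical (tid-list) database layout, frequency counting by tid-list intersection, and grouping candidate itemsets into prefix-based equivalence classes that could be handed to different processors. It is an illustration under stated assumptions, not the Par-MinMax implementation. The function names (`to_vertical`, `support`, `prefix_classes`, `assign_classes`), the toy transactions, and the greedy largest-class-first assignment are all hypothetical. The paper's pruning strategies and selective database duplication are not shown.

```python
from collections import defaultdict
from itertools import combinations


def to_vertical(transactions):
    """Map each item to the set of transaction ids containing it (its tid-list)."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return dict(tidlists)


def support(itemset, tidlists):
    """Frequency of an itemset: size of the intersection of its items' tid-lists."""
    return len(set.intersection(*(tidlists[i] for i in itemset)))


def prefix_classes(tidlists, min_support):
    """Group frequent 2-itemsets by their first item, i.e. their shared prefix."""
    frequent = sorted(i for i, t in tidlists.items() if len(t) >= min_support)
    classes = defaultdict(list)
    for a, b in combinations(frequent, 2):
        if len(tidlists[a] & tidlists[b]) >= min_support:
            classes[a].append((a, b))
    return dict(classes)


def assign_classes(classes, n_procs):
    """Assumed heuristic: give the largest class to the least-loaded processor."""
    loads = [0] * n_procs
    plan = [[] for _ in range(n_procs)]
    ordered = sorted(classes.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for prefix, members in ordered:
        p = loads.index(min(loads))
        plan[p].append(prefix)
        loads[p] += len(members)
    return plan


# Toy database (hypothetical data): five transactions over items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
tidlists = to_vertical(transactions)
classes = prefix_classes(tidlists, min_support=3)
plan = assign_classes(classes, n_procs=2)
```

Once the classes are assigned, each processor needs only the tid-lists of the items in its own classes, which is why a scheme of this kind can run without synchronization between processors.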

Hui Wang | Hongjun Zhang | Zhiting Xiao | Shengyi Jiang

[1] Roberto J. Bayardo, et al. Efficiently mining long patterns from databases, 1998, SIGMOD '98.

[2] Mohammed J. Zaki, et al. Efficiently mining maximal frequent itemsets, 2001, Proceedings 2001 IEEE International Conference on Data Mining.

[3] Rakesh Agrawal, et al. Parallel Mining of Association Rules, 1996, IEEE Trans. Knowl. Data Eng.

[4] Srinivasan Parthasarathy, et al. Parallel Algorithms for Discovery of Association Rules, 1997, Data Mining and Knowledge Discovery.

[5] Mohammed J. Zaki. Parallel and distributed association mining: a survey, 1999, IEEE Concurr.

[6] Kenli Li, et al. A Maximal Frequent Itemset Algorithm, 2003, RSFDGrC.

[7] Mohammed J. Zaki, et al. ADMIT: anomaly-based data mining for intrusions, 2002, KDD.

[8] Johannes Gehrke, et al. MAFIA: a maximal frequent itemset algorithm for transactional databases, 2001, Proceedings 17th International Conference on Data Engineering.

[9] Heikki Mannila, et al. Verkamo: Fast Discovery of Association Rules, 1996, KDD 1996.

[10] Srinivasan Parthasarathy, et al. Parallel Data Mining for Association Rules on Shared-memory Systems, 1998.

[11] Heikki Mannila, et al. Fast Discovery of Association Rules, 1996, Advances in Knowledge Discovery and Data Mining.

[12] Ruoming Jin, et al. Shared Memory Parallelization of Data Mining Algorithms: Techniques, Programming Interface, and Performance, 2002.