HPFP-Miner: A Novel Parallel Frequent Itemset Mining Algorithm

Frequent itemset mining is a fundamental problem in data mining and underlies many data mining tasks. Most of these tasks require multiple passes over the database, and because the database is usually large, scalable high-performance solutions involving multiple processors are required. In this paper, we present a novel parallel frequent itemset mining algorithm called HPFP-Miner. The proposed algorithm is based on FP-Growth and introduces little communication overhead by efficiently partitioning the list of frequent items over the processors. Experimental results show that HPFP-Miner has good scalability and performance.
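
To make the idea of partitioning a frequent item list over processors concrete, the following is a minimal sketch, not the HPFP-Miner procedure itself, which the abstract does not specify. It assumes a round-robin assignment over items sorted by descending support; the function names `frequent_items` and `partition_round_robin`, the toy transactions, and the support threshold are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequent_items(transactions, min_support):
    """Return (item, support) pairs with support >= min_support,
    sorted by descending support, ties broken alphabetically."""
    # Count each item at most once per transaction.
    counts = Counter(item for t in transactions for item in set(t))
    items = [(item, c) for item, c in counts.items() if c >= min_support]
    items.sort(key=lambda pair: (-pair[1], pair[0]))
    return items


def partition_round_robin(items, num_workers):
    """Assign frequent items to workers in round-robin order.

    This is an assumed balancing scheme for illustration only; each
    worker would then mine the patterns associated with its own items.
    """
    parts = [[] for _ in range(num_workers)]
    for i, (item, _) in enumerate(items):
        parts[i % num_workers].append(item)
    return parts


# Toy database (hypothetical data).
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c", "d"],
    ["b", "c"],
    ["a", "b", "c", "e"],
]

flist = frequent_items(transactions, min_support=2)
parts = partition_round_robin(flist, num_workers=2)
print(flist)  # [('a', 4), ('b', 4), ('c', 4)]
print(parts)  # [['a', 'c'], ['b']]
```

In a scheme like this, each processor needs only the shared frequent item list and its own share of items to start mining, which is one plausible way to keep inter-processor communication low.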
