The Algorithm for Mining Global Frequent Itemsets based on Big Data

Several algorithms exist for mining global frequent itemsets. Most of them adopt an Apriori-like approach and therefore generate a large number of candidate itemsets. To address this problem, an algorithm for mining global frequent itemsets based on big data, named MGFI, is proposed. MGFI first computes the local frequent itemsets with MapReduce, the center node then collects the data, and finally the global frequent itemsets are obtained with MapReduce. By combining top-down and bottom-up search strategies, MGFI requires less communication traffic. Theoretical analysis and experimental results suggest that MGFI is fast and effective.

Keywords: Data Mining; Global Frequent Itemsets; Big Data; MapReduce; FP-tree
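The local-then-global flow described above can be illustrated with a minimal in-memory sketch. It is not the MGFI algorithm itself: it replaces the FP-tree and the top-down and bottom-up search strategies with brute-force enumeration, and it simulates the MapReduce passes with plain loops. The function names, the `min_ratio` threshold, and the `max_size` cap are all hypothetical and introduced only for illustration. The sketch relies on the fact that an itemset frequent over all the data must be frequent at one site or more under the same relative threshold, so the union of the local results is a safe candidate set.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_ratio, max_size=3):
    """Pass 1, per site: itemsets whose local support ratio is >= min_ratio.

    Brute-force enumeration stands in for the FP-tree here (assumption).
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, min(max_size, len(items)) + 1):
            counts.update(combinations(items, k))
    threshold = min_ratio * len(transactions)
    return {s for s, c in counts.items() if c >= threshold}


def count_candidates(transactions, candidates):
    """Pass 2, per site: local support count of every candidate itemset."""
    counts = Counter()
    for t in transactions:
        items = set(t)
        for cand in candidates:
            if items.issuperset(cand):
                counts[cand] += 1
    return counts


def mine_global_frequent_itemsets(sites, min_ratio, max_size=3):
    """Simulate the two passes over a list of sites (lists of transactions)."""
    # The center node collects the union of the local frequent itemsets.
    candidates = set()
    for site in sites:
        candidates |= local_frequent_itemsets(site, min_ratio, max_size)

    # Reduce step: sum the per-site counts of each candidate.
    total = Counter()
    for site in sites:
        total.update(count_candidates(site, candidates))

    # Keep the candidates that meet the global support threshold.
    n = sum(len(site) for site in sites)
    return {s: c for s, c in total.items() if c >= min_ratio * n}
```

In a real deployment each per-site call would run as a map task and the summation as a reduce task; the loops here only mirror that data flow on a single machine.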
