Apriori Parallel Improved Algorithm Based on MapReduce Distributed Architecture
In big-data settings, the traditional serial Apriori algorithm is inefficient and generates a large number of candidate itemsets when processing massive data. This paper proposes an improved parallel algorithm based on the MapReduce distributed architecture. Building on the basic MapReduce implementation of Apriori, it restructures the original transaction database and parallelizes processing across data set fragments. The algorithm optimizes the transaction database, the counting of candidate itemsets, and the pruning strategy. Experimental results show that the improved algorithm reduces the number of candidate itemsets and improves efficiency.
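For orientation, the baseline that the abstract builds on, Apriori with candidate counting split across data fragments, can be sketched in plain Python. This is a minimal single-process illustration, not the improved algorithm proposed in the paper: the abstract does not specify the database reconstruction or the optimized pruning strategy, so neither appears here. The function names (`map_count`, `reduce_count`, `generate_candidates`, `apriori`) and the sample data are hypothetical, and the "map" and "reduce" phases are ordinary function calls rather than jobs on a real MapReduce cluster.

```python
from collections import Counter
from itertools import combinations


def map_count(split, candidates):
    """Map phase: count, within one data fragment, how many transactions
    contain each candidate itemset."""
    local = Counter()
    for transaction in split:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:  # candidate is a subset of the transaction
                local[cand] += 1
    return local


def reduce_count(partials, min_support):
    """Reduce phase: sum the per-fragment counts and keep only itemsets
    whose global support reaches min_support."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return {c: n for c, n in total.items() if n >= min_support}


def generate_candidates(frequent, k):
    """Build k-itemsets from the items of the frequent (k-1)-itemsets and
    keep only those whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({i for s in frequent for i in s})
    out = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            out.add(frozenset(combo))
    return out


def apriori(splits, min_support):
    """Run level-wise Apriori over a list of data fragments (splits)."""
    candidates = {frozenset([i]) for split in splits for t in split for i in t}
    result = {}
    k = 1
    while candidates:
        partials = [map_count(s, candidates) for s in splits]  # one call per fragment
        frequent = reduce_count(partials, min_support)
        result.update(frequent)
        k += 1
        candidates = generate_candidates(frequent, k)
    return result


# Hypothetical sample: five transactions split into two fragments.
splits = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
frequent_itemsets = apriori(splits, min_support=3)
```

In this sketch every pass over the data corresponds to one map/reduce round, which is why the number of candidate itemsets per pass dominates the cost and why the abstract targets candidate counting and pruning.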