Non-iteration Parallel Algorithm for Frequent Pattern Discovery

The Apriori algorithm incurs a high time overhead when mining long frequent patterns. Drawing on the MapReduce distributed programming model, this paper departs from Apriori's original idea of discovering frequent itemsets by gradually increasing the number of items in each itemset. It proposes a new non-iterative parallel algorithm for frequent pattern discovery that can obtain frequent patterns of any chosen length directly. The experimental results show that the proposed algorithm has better time performance than parallel algorithms that follow the traditional Apriori approach.
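The abstract does not give the algorithm's details, so the following is only a minimal single-machine sketch of one way a non-iterative, MapReduce-style count of length-k itemsets could look. It is not the paper's method. The function names (`map_transaction`, `reduce_counts`, `frequent_itemsets`), the sample transactions, and the use of an absolute support count are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def map_transaction(transaction, k):
    """Map step (hypothetical): emit (k-itemset, 1) for each k-item subset."""
    # Sort and deduplicate so the same itemset always yields the same key.
    items = sorted(set(transaction))
    for itemset in combinations(items, k):
        yield itemset, 1


def reduce_counts(pairs):
    """Reduce step (hypothetical): sum the emitted counts per itemset key."""
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    return counts


def frequent_itemsets(transactions, k, min_support):
    """Return k-itemsets whose support count is at least min_support.

    Only itemsets of length k are counted; no pass over shorter lengths
    is made, in contrast to Apriori's level-by-level iteration.
    """
    pairs = (pair for t in transactions for pair in map_transaction(t, k))
    counts = reduce_counts(pairs)
    return {s: c for s, c in counts.items() if c >= min_support}


# Illustrative data, not taken from the paper.
transactions = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "eggs", "jam"],
]

pairs_result = frequent_itemsets(transactions, k=2, min_support=3)
triples_result = frequent_itemsets(transactions, k=3, min_support=2)
print(pairs_result)
print(triples_result)
```

In this sketch each transaction is processed independently, which is what would let the map step run in parallel. The trade-off is that the number of k-item subsets per transaction grows combinatorially with transaction length, so a practical design would need some way to bound that cost.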
