Fast Frequent Pattern Mining without Candidate Generations on GPU by Low Latency Memory Allocation

In this work, we propose a GPU-accelerated algorithm for frequent pattern (FP) mining without candidate generation. We observe that the existing FP-growth algorithm has characteristics that make it ill-suited to GPUs, including its tree data structure, deep recursion, and heavy dynamic memory allocation. By using iterative execution and allocating memory collectively on the GPU, our proposed method significantly reduces the latency caused by the large memory allocations of the original FP-growth. Experimental results show that our solution outperforms the baselines, including sequential CPU-only FP-growth and existing GPU-accelerated Apriori and FP-growth, on various data sets, with speedups ranging from several times to a hundred times.
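
As a rough illustration of the two ideas named above, the following Python sketch shows one plausible reading of "collective allocation" and "iterative execution"; it is not the paper's implementation. The assumption is that many small per-subproblem requests are replaced by a single pooled allocation whose offsets come from an exclusive prefix sum, and that recursion is replaced by a level-by-level loop. The names `plan_offsets`, `allocate_collectively`, `process_levels`, and the `expand` callback are hypothetical, and a Python list stands in for what would be one device allocation on a GPU.

```python
def plan_offsets(sizes):
    """Exclusive prefix sum over the requested sizes.

    Returns (offsets, total): request i owns
    pool[offsets[i] : offsets[i] + sizes[i]].
    """
    offsets, total = [], 0
    for size in sizes:
        offsets.append(total)
        total += size
    return offsets, total


def allocate_collectively(sizes):
    """Serve all requests from one pool instead of one allocation each.

    Assumption: on a GPU the pool would be a single device allocation;
    here a plain list stands in for it.
    """
    offsets, total = plan_offsets(sizes)
    pool = [None] * total
    return pool, offsets


def process_levels(initial_sizes, expand):
    """Iterative driver: one pooled allocation per level, no recursion.

    `expand` is a hypothetical callback that maps the size of one
    subproblem to the sizes of the subproblems it spawns.
    Returns the number of pooled allocations performed.
    """
    level_sizes = list(initial_sizes)
    allocations = 0
    while level_sizes:
        pool, offsets = allocate_collectively(level_sizes)
        allocations += 1
        # Build the next level's requests from the current level.
        level_sizes = [child for size in level_sizes for child in expand(size)]
    return allocations
```

In this sketch the number of allocations grows with the depth of the search rather than with the number of subproblems, which is the kind of latency saving the abstract describes.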
