High performance evaluation of evolutionary-mined association rules on GPUs

Association rule mining is a well-known data mining task, but it requires considerable computational time and memory when mining large-scale, high-dimensional data sets. This is mainly due to the evaluation process, in which the antecedent and consequent of each mined rule are evaluated against every record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model may be applied to any association rule mining algorithm. The use of GPUs and the compute unified device architecture (CUDA) programming model enables the mined rules to be evaluated in a massively parallel way, thus reducing the computational time required. The proposal takes advantage of concurrent kernel execution and asynchronous data transfers, which improve the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model against single-threaded, multi-threaded, and GPU implementations. The results obtained show an interpreter performance above 67 billion giga operations per second, and a speed-up by a factor of up to 454 over the single-threaded CPU model, when using two NVIDIA GTX 480 GPUs. The evaluation model demonstrates its efficiency and scalability with respect to problem complexity and the number of instances, rules, and GPU devices.
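To make the cost of the evaluation process concrete, the following is a minimal sequential sketch in Python, not the GPU model proposed in the paper. It assumes a hypothetical rule representation in which the antecedent and consequent are predicates over a record; the names `evaluate_rule`, `records`, `antecedent`, and `consequent` are illustrative and do not come from the paper. Support and confidence are used here as standard rule quality measures. Each (rule, record) check is independent of the others, which is the property that makes the evaluation amenable to massively parallel execution.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a record maps attribute names to values,
# and each side of a rule is a predicate over a single record.
Record = Dict[str, float]
Condition = Callable[[Record], bool]


def evaluate_rule(antecedent: Condition,
                  consequent: Condition,
                  records: List[Record]) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    covers_antecedent = 0  # records satisfying the antecedent
    covers_both = 0        # records satisfying antecedent and consequent

    # One independent check per record: this loop is the part that a
    # parallel implementation would distribute across threads.
    for record in records:
        a = antecedent(record)
        c = consequent(record)
        covers_antecedent += int(a)
        covers_both += int(a and c)

    n = len(records)
    support = covers_both / n if n else 0.0
    confidence = covers_both / covers_antecedent if covers_antecedent else 0.0
    return support, confidence


# Toy data set (illustrative values only).
records = [
    {"age": 25, "income": 30},
    {"age": 40, "income": 80},
    {"age": 52, "income": 95},
    {"age": 33, "income": 45},
]

# Hypothetical rule: IF age > 35 THEN income > 70.
antecedent = lambda r: r["age"] > 35
consequent = lambda r: r["income"] > 70

support, confidence = evaluate_rule(antecedent, consequent, records)
print(support, confidence)
```

In this sequential form, the work grows with the number of rules times the number of records, so the per-record checks and the final count reduction are the natural units to parallelize.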
