How to exploit high performance computing in population-based metaheuristics for solving association rule mining problem

This paper explores the application of population-based metaheuristic approaches to the association rule mining (ARM) problem. It investigates the combination of GPU and cluster-based parallel computing to accelerate the extraction of correlations between items in sizeable data instances. We propose four parallel approaches that exploit cluster computing in the rule generation process and massive GPU threading in the evaluation process, where the association rules are evaluated in parallel on the GPU. To validate the proposed approaches, the most widely used population-based metaheuristics, namely genetic algorithms (GA), particle swarm optimization (PSO), and bees swarm optimization (BSO), were executed on a cluster of GPUs to solve benchmarks of large and big ARM instances. We used an Intel Xeon E5520 64-bit quad-core processor coupled to an Nvidia Tesla C2075 GPU device. The results show that BSO outperforms GA and PSO. They also show that the proposed solution outperforms the HPC-based ARM approaches when exploring the Webdocs instance (the largest instance available on the web). To our knowledge, this is the first work that explores the combination of GPU and cluster-based parallel computing with population-based metaheuristics in association rule mining.
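To make the evaluation step concrete, the following is a minimal sequential sketch in Python of how a population of candidate rules could be scored by support and confidence. It is not the authors' implementation: the function names, the toy transactions, and the choice of support and confidence as the evaluation measures are assumptions made for illustration. The point it shows is that each rule is scored independently of the others, which is the property that would allow one rule to be assigned to one GPU thread or block.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]
Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def evaluate_rule(rule: Rule, transactions: List[Transaction]) -> Tuple[float, float]:
    """Return (support, confidence) of one rule after a full scan of the data."""
    antecedent, consequent = rule
    both = antecedent | consequent
    # Count transactions containing the antecedent, and those containing the whole rule.
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if both <= t)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


def evaluate_population(rules: List[Rule],
                        transactions: List[Transaction]) -> List[Tuple[float, float]]:
    # No rule depends on another rule's result, so this loop is the part
    # that could be distributed over GPU threads (hypothetical mapping).
    return [evaluate_rule(r, transactions) for r in rules]


# Toy data, invented for this example.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]
rules = [
    (frozenset({"bread"}), frozenset({"milk"})),
    (frozenset({"milk"}), frozenset({"butter"})),
]
results = evaluate_population(rules, transactions)
```

In this sketch the metaheuristic (GA, PSO, or BSO) would only be responsible for producing the `rules` list at each iteration; the scan over `transactions` dominates the cost on large instances, which is why that part is the natural candidate for offloading.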
