Exploiting GPU parallelism in improving bees swarm optimization for mining big transactional databases

Abstract This paper investigates the use of the GPU (Graphics Processing Unit) to improve the performance of the bees swarm optimization (BSO) metaheuristic for solving the association rule mining (ARM) problem. Although this metaheuristic has proved effective, it requires substantial computational resources when mining big databases. To overcome this limitation, we develop a GPU-based Bees Swarm Optimization Miner (GBSO-Miner), in which the GPU serves as a co-processor for the CPU-time-intensive steps of the algorithm. Unlike state-of-the-art GPU-based ARM methods, all BSO steps, including the determination of the search area, the local search, the evaluation, and the dancing, are performed on the GPU. A mapping method between the input data of each task and the GPU blocks/threads is developed. To demonstrate the effectiveness of the GBSO-Miner framework, intensive experiments have been carried out. The results show that GBSO-Miner outperforms the baseline methods from the literature (GPApriori, MEGPU, and Dmine) on big textual and graph databases, and that it is up to 800 times faster than an optimized CPU implementation.
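
As an illustration of the evaluation step mentioned above, the following is a minimal CPU-side sketch in Python, not the GBSO-Miner implementation. It assumes that a rule is scored by the standard support and confidence measures, and it mimics one possible block/thread mapping by splitting the transactions into fixed-size chunks. The function name `evaluate_rule`, the parameter `block_size`, and the chunking scheme are hypothetical and do not come from the paper.

```python
from typing import FrozenSet, List, Tuple

Transaction = FrozenSet[str]


def evaluate_rule(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    transactions: List[Transaction],
    block_size: int = 4,  # hypothetical stand-in for a GPU block size
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> consequent."""
    both = antecedent | consequent
    n_antecedent = 0
    n_both = 0
    # Assumption: each chunk plays the role of one GPU block and each
    # transaction inside it the role of one thread.
    for start in range(0, len(transactions), block_size):
        block = transactions[start:start + block_size]
        # Per-chunk partial counts; on a GPU this would be a block-level
        # reduction followed by a global accumulation.
        n_antecedent += sum(1 for t in block if antecedent <= t)
        n_both += sum(1 for t in block if both <= t)
    support = n_both / len(transactions) if transactions else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c"}),
]
result = evaluate_rule(frozenset({"a"}), frozenset({"b"}), transactions)
print(result)  # (0.6, 0.75)
```

Each transaction is checked independently of the others, which is why this step lends itself to a one-transaction-per-thread layout; only the final counts need to be combined.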
