Improvement of Apriori Algorithm Using Parallelization Technique on Multi-CPU and GPU Topology

Extracting frequent patterns from large datasets remains a difficult task in data mining, compounded by temporal and spatial complexity. Although the Apriori algorithm is seminal in this area, its limitations become more pronounced as datasets grow. In response, we propose a method that leverages parallel network topologies and GPUs. The method has two main features: (1) parallel processing to reach optimal results faster and (2) integration of the cat and mouse-based optimizer (CMBO), an algorithm modeled on the natural interaction between pursuing cats and fleeing mice. The optimizer operates in two phases: the cats first pursue the mice, and the mice then evade them. Agents are classified according to their objective function values. In addition, our architecture combines two Nvidia graphics cards in a parallel configuration, which yields a clear advantage over conventional CPUs. Together, these components address the inherent shortcomings of the Apriori algorithm and improve the extraction of association rules, identifying frequent patterns with greater precision. A comprehensive evaluation across a range of network topologies clarifies their respective strengths and weaknesses. Compared with the baseline Apriori algorithm, our method is markedly faster and more effective, representing a step forward for data mining research.
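For readers unfamiliar with the baseline, the following is a minimal, sequential Python sketch of classical level-wise Apriori. It is not the parallel CMBO-based method summarized above. The function name `apriori`, the absolute `min_support` threshold, and the toy transactions are assumptions made for illustration. The support-counting loop is marked because each candidate is counted independently of the others, which is the kind of work that a multi-CPU or GPU implementation could distribute.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute count (an illustrative choice, not from the paper).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: combine frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Support counting: each candidate is counted independently,
        # so this loop is the natural place to parallelize.
        frequent = {}
        for c in candidates:
            n = sum(1 for t in transactions if c <= t)
            if n >= min_support:
                frequent[c] = n
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [
        {"bread", "milk"},
        {"bread", "diapers", "beer"},
        {"milk", "diapers", "beer"},
        {"bread", "milk", "diapers"},
        {"bread", "milk", "beer"},
    ]
    for itemset, count in sorted(apriori(baskets, 3).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), count)
```

In this sketch every pass rescans all transactions for every candidate, which illustrates why the sequential algorithm scales poorly as the dataset and the candidate set grow.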
