Single Scan Polynomial Algorithms for Frequent Itemset Mining in Big Databases

This paper considers frequent itemset mining (FIM) in big transactional databases. It introduces a novel approach (Bio-SS) that combines bio-inspired algorithms with the single scan algorithm (SSFIM). The proposed approach addresses the limitations of SSFIM by using bio-inspired operators in the itemset generation process. This reduces the time complexity of SSFIM from exponential to polynomial, while retaining its ability to derive the frequent itemsets with a single database scan, independently of the minimum support value. This considerably accelerates the scan procedure compared to existing approaches, especially for large-scale databases. The numerical results show that Bio-SS outperforms both exact and metaheuristic-based baseline FIM approaches when dealing with big databases.
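To make the single-scan idea concrete, the following minimal Python sketch counts itemsets in one pass over the transactions and applies the minimum support threshold only afterwards. It assumes the exhaustive variant, in which every non-empty subset of each transaction is enumerated; this is the step whose cost is exponential in transaction length. The function names `single_scan_counts` and `frequent_itemsets` are hypothetical, and the bio-inspired generation that Bio-SS substitutes for the exhaustive enumeration is not shown.

```python
from collections import Counter
from itertools import combinations


def single_scan_counts(transactions):
    """Count every itemset that occurs in the database using one pass.

    Assumption: exhaustive subset enumeration per transaction, which costs
    O(2^n) for a transaction of n items. This is an illustrative sketch,
    not the authors' implementation.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return counts


def frequent_itemsets(counts, n_transactions, min_support):
    """Keep the itemsets whose relative support reaches min_support."""
    return {
        itemset: count
        for itemset, count in counts.items()
        if count / n_transactions >= min_support
    }


if __name__ == "__main__":
    database = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
    counts = single_scan_counts(database)  # the only pass over the data
    # Different thresholds reuse the same counts; no rescan is needed.
    print(frequent_itemsets(counts, len(database), 0.6))
    print(frequent_itemsets(counts, len(database), 1.0))
```

Because the counts are collected before any threshold is applied, changing the minimum support only repeats the cheap filtering step, which is what makes the scan independent of the support value.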
