Multi-Objective Optimization for High-Dimensional Maximal Frequent Itemset Mining

The solution space of frequent itemset mining grows exponentially with the number of attributes, so the high dimensionality of big data leads to a combinatorial explosion. Yet mining frequent itemsets from high-dimensional transaction sets is a prerequisite for association rule analysis on big data. Classical algorithms such as Apriori and FP-Growth, together with their derivatives, become impractical in such an explosive solution space because of their heavy consumption of storage space and running time. This paper proposes a multi-objective optimization algorithm for mining frequent itemsets from high-dimensional data. First, all frequent 2-itemsets are generated by scanning the transaction set; these serve as the objects of population evolution, to which new items are then added. The algorithm searches for maximal frequent itemsets because every non-empty subset of a frequent itemset is itself frequent, so each maximal frequent itemset accounts for as many frequent subsets as possible. During the run, lethal gene fragments in individuals are recorded and eliminated so that the affected individuals can be revived. The result is the set of Pareto-optimal frequent itemsets: all non-empty subsets of these solutions are frequent itemsets, and all of their proper supersets are infrequent. Experiments demonstrate the practicability and validity of the proposed algorithm on big data.
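The notion of a maximal frequent itemset can be made concrete with a small sketch. The code below is not the evolutionary algorithm proposed in the paper: it replaces population evolution with a plain greedy extension, and the function names, the toy transactions, and the absolute threshold `min_count` are all assumptions made for illustration. It starts from the frequent 2-itemsets, as the abstract describes, and adds items while the itemset stays frequent. Each result is maximal, but greedy growth is not guaranteed to find every maximal frequent itemset.

```python
from itertools import combinations

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions)

def frequent_pairs(transactions, min_count):
    """All 2-itemsets that occur in at least `min_count` transactions."""
    items = sorted(set().union(*transactions))
    return [frozenset(p) for p in combinations(items, 2)
            if count(p, transactions) >= min_count]

def grow_to_maximal(seed, transactions, min_count):
    """Greedily add items while the itemset remains frequent.

    When no single item can be added, no frequent proper superset exists,
    because every subset of a frequent itemset is itself frequent.
    """
    items = sorted(set().union(*transactions))
    current = frozenset(seed)
    extended = True
    while extended:
        extended = False
        for item in items:
            candidate = current | {item}
            if item not in current and count(candidate, transactions) >= min_count:
                current = candidate
                extended = True
    return current

def maximal_from_pairs(transactions, min_count):
    """Maximal frequent itemsets reachable by greedy growth from frequent pairs."""
    return {grow_to_maximal(p, transactions, min_count)
            for p in frequent_pairs(transactions, min_count)}

# Hypothetical toy data: five transactions over the items a, b, c, d.
transactions = [frozenset(t) for t in ("abc", "abc", "abd", "bcd", "ac")]
result = maximal_from_pairs(transactions, min_count=2)
print(sorted(sorted(s) for s in result))  # [['a', 'b', 'c'], ['b', 'd']]
```

In this toy data, `{a, b, c}` and `{b, d}` are the maximal frequent itemsets: each of their non-empty subsets occurs in at least two transactions, and adding any further item drops the count below two.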
