Efficient Mining of Frequent Itemsets Using Only One Dynamic Prefix Tree

Frequent itemset mining is a fundamental problem in data mining because frequent itemsets are used extensively in reasoning, classification, clustering, and other tasks. To mine frequent itemsets, previous algorithms based on a prefix tree structure have to construct many prefix trees, which is very time-consuming. In this paper, we propose a novel frequent itemset mining algorithm called DPT (Dynamic Prefix Tree), which uses only one prefix tree. We first introduce the concept of the post-conditional database of an itemset and analyze how an itemset's post-conditional database is distributed in a prefix tree representing a database. We then explain how DPT adjusts the prefix tree to mine frequent itemsets and present three optimization techniques. An interesting advantage of DPT is that, after slight modifications, the algorithm can directly output a prefix tree representing all frequent itemsets. By using only one dynamic prefix tree, DPT avoids the high cost of constructing many prefix trees and thus gains a significant performance improvement. Experimental results show that DPT substantially outperforms previous algorithms in both running time and memory usage, and that the prefix tree of all frequent itemsets output by DPT can be used more efficiently than the list of frequent itemsets output by previous algorithms.
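To make the phrase "a prefix tree representing a database" concrete, the following is a minimal sketch in Python of a static prefix tree built from a small transaction database, together with a support query over it. It is not the DPT algorithm: it does not adjust the tree dynamically or use post-conditional databases. The names `Node`, `build_prefix_tree`, and `support`, and the fixed item order, are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One prefix-tree node; `count` is the number of transactions through it."""
    count: int = 0
    children: dict = field(default_factory=dict)


def build_prefix_tree(transactions, order):
    """Insert every transaction as a root-to-node path, items sorted by `order`."""
    rank = {item: i for i, item in enumerate(order)}
    root = Node()
    for transaction in transactions:
        root.count += 1  # the root counts all transactions
        items = sorted({i for i in transaction if i in rank}, key=rank.__getitem__)
        node = root
        for item in items:
            node = node.children.setdefault(item, Node())
            node.count += 1
    return root


def support(root, itemset, order):
    """Count the transactions that contain every item of `itemset`."""
    rank = {item: i for i, item in enumerate(order)}
    wanted = sorted(set(itemset), key=rank.__getitem__)

    def walk(node, remaining):
        if not remaining:
            return node.count  # all wanted items matched on this path
        total = 0
        for item, child in node.children.items():
            if item == remaining[0]:
                total += walk(child, remaining[1:])
            elif rank[item] < rank[remaining[0]]:
                # An earlier item: the wanted item may still appear deeper.
                total += walk(child, remaining)
            # A later item: the wanted item cannot appear below, so skip.
        return total

    return walk(root, wanted)


# Illustrative data only; not taken from the paper.
order = ["a", "b", "c"]
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
tree = build_prefix_tree(transactions, order)
```

With this toy database, `support(tree, ["a", "c"], order)` counts the two transactions that contain both `a` and `c`. Transactions that share a prefix share nodes, which is why one tree can stand in for the whole database.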
