A Survey on Representation for Itemsets in Association Rule Mining

Mining frequent itemsets is one of the main challenges in association rule mining. The efficiency of frequent itemset mining depends on the computation time and on the data structure used to store the itemsets, and the choice of data structure largely determines the space requirement. Most algorithms work well on sparse datasets. On large datasets, however, computation becomes difficult and execution time increases, which limits the scalability of the algorithm. With a compact and concise representation, the itemsets can fit in memory and therefore require no I/O operations during mining. The most commonly used data structures are the array, the tree, and the trie. In this paper, we present a comparison of the different data structures used by the mining algorithms.
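To make the space argument concrete, the following is a minimal sketch, not taken from the surveyed algorithms, that stores the same candidate 2-itemsets once as a flat array of tuples and once as a trie with shared prefixes. The function names (`build_flat`, `build_trie`, `trie_support`, `count_nodes`) and the toy transactions are assumptions chosen for illustration only.

```python
from itertools import combinations


class TrieNode:
    """One item on a path; `count` is the support of the itemset ending here."""

    def __init__(self):
        self.children = {}
        self.count = 0


def build_flat(transactions, k):
    """Flat (array-like) representation: a sorted list of (itemset, support)."""
    counts = {}
    for t in transactions:
        for itemset in combinations(sorted(set(t)), k):
            counts[itemset] = counts.get(itemset, 0) + 1
    return sorted(counts.items())


def build_trie(transactions, k):
    """Trie representation: itemsets with a common prefix share nodes."""
    root = TrieNode()
    for t in transactions:
        for itemset in combinations(sorted(set(t)), k):
            node = root
            for item in itemset:
                node = node.children.setdefault(item, TrieNode())
            node.count += 1
    return root


def trie_support(root, itemset):
    """Walk the trie along the sorted itemset; 0 if the path is absent."""
    node = root
    for item in sorted(itemset):
        node = node.children.get(item)
        if node is None:
            return 0
    return node.count


def count_nodes(node):
    """Number of item nodes stored in the trie (the root is excluded)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical toy dataset, used only to compare the two layouts.
transactions = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "c", "d"],
    ["a", "b", "c", "d"],
]
flat = build_flat(transactions, 2)
trie = build_trie(transactions, 2)
flat_cells = sum(len(itemset) for itemset, _ in flat)
print(flat_cells, count_nodes(trie), trie_support(trie, ["b", "a"]))
```

On this toy input the flat layout stores every item of every distinct itemset, whereas the trie stores each shared prefix once, so it needs fewer item cells. How large the saving is on real data depends on how much the itemsets overlap, which is one reason the choice of structure matters for dense versus sparse datasets.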
