A Survey of the Different Itemset Representation for Candidate Generation

Itemset representation is a pivotal part of association rule mining. It determines how the itemsets in a dataset are stored in memory. Different data structures can be used to store itemsets; common choices include linked lists and arrays. The efficiency of an association rule mining algorithm depends largely on how the itemsets are stored. The execution time and memory consumption of the itemset representation play a vital role in determining the performance of the mining algorithm. This paper presents a study of the data structures used for itemset representation. The data structures are tested on different datasets for the generation of candidate itemsets, and their performance in the candidate generation process is analysed.
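To make the idea of candidate generation over a stored itemset representation concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes an array-like representation in which each itemset is a sorted tuple, and it uses the standard Apriori-style join and prune steps. The function name `generate_candidates` and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def generate_candidates(frequent, k):
    """Build candidate k-itemsets from frequent (k-1)-itemsets.

    Assumption: every itemset is a sorted tuple (an array-like
    representation), so two itemsets can be joined when they share
    the same first k-2 items.
    """
    prev = sorted(frequent)   # lexicographic order groups equal prefixes
    prev_set = set(prev)      # hash lookup for the prune step
    candidates = []
    for i in range(len(prev)):
        for j in range(i + 1, len(prev)):
            a, b = prev[i], prev[j]
            if a[:k - 2] != b[:k - 2]:
                break  # later tuples cannot share this prefix either
            cand = a + (b[-1],)  # join step: extend a with b's last item
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(sub in prev_set for sub in combinations(cand, k - 1)):
                candidates.append(cand)
    return candidates


# Hypothetical frequent 2-itemsets, for illustration only.
frequent_2 = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]
candidates_3 = generate_candidates(frequent_2, 3)
print(candidates_3)
```

In this sketch the choice of representation shows up directly in the cost: the sorted tuples make the prefix comparison in the join step cheap, while the auxiliary hash set trades extra memory for fast subset lookups during pruning. A linked-list representation would change both costs, which is the kind of trade-off the study examines.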
