An Improved Eclat Algorithm for Mining Association Rules Based on Increased Search Strategy

Although the Eclat algorithm is efficient for mining association rules, it has several disadvantages that limit its efficiency. In this paper, we propose an improved Eclat algorithm, called Eclat_growth, which is based on the increased search strategy. Eclat_growth has three main steps. First, it scans the database and stores it in a table using the vertical data format. Then, it builds an increased two-dimensional pattern tree, and the TID_sets of the itemsets in the vertical data format table are added to the pattern tree row by row. New frequent itemsets are generated by combining each newly added item with the frequent itemsets already in the pattern tree. Finally, all frequent itemsets are obtained by collecting all nodes of the pattern tree. While generating new frequent itemsets, prior knowledge is used to prune the candidate itemsets fully. To reduce the run time of intersecting two itemsets and calculating the support degree, we propose a new method called BSRI (Boolean array setting and retrieval by indexes of transactions). Comparisons of Eclat_growth with Eclat, Eclat-diffsets, Eclat-opt and hEclat indicate that Eclat_growth has the highest performance in mining association rules from various databases.
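
As an illustration of the vertical data format and of intersecting TID_sets to obtain a support count, the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not specify the details of BSRI, so the Boolean-array intersection shown here is only one plausible reading of "Boolean array setting and retrieval by indexes of transactions". The function names `to_vertical` and `intersect_with_boolean_array` and the toy transactions are hypothetical.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert horizontal transactions into a vertical table.

    Returns a dict mapping each item to the ascending list of
    transaction IDs (its TID_set) in which the item occurs.
    """
    vertical = defaultdict(list)
    for tid, items in enumerate(transactions):
        for item in set(items):  # ignore duplicate items in a transaction
            vertical[item].append(tid)
    return dict(vertical)


def intersect_with_boolean_array(tids_a, tids_b, n_transactions):
    """Intersect two TID_sets using a Boolean array indexed by TID.

    Assumed reading of BSRI: set a flag for every TID of the first
    itemset, then retrieve the flags by the TIDs of the second one.
    """
    marks = [False] * n_transactions
    for tid in tids_a:  # "setting" step
        marks[tid] = True
    # "retrieval" step: keep the TIDs of the second set that are flagged
    return [tid for tid in tids_b if marks[tid]]


# Toy database (hypothetical data, not taken from the paper)
transactions = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "d"],
    ["a", "b", "c", "d"],
]

vertical = to_vertical(transactions)
tids_ab = intersect_with_boolean_array(
    vertical["a"], vertical["b"], len(transactions)
)
support_ab = len(tids_ab)  # support count of the itemset {a, b}
```

In this sketch the intersection takes time linear in the sizes of the two TID_sets plus the cost of allocating the array; an implementation aiming at low run time would presumably reuse the array across intersections rather than reallocating it, which is again an assumption and not something the abstract states.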
