Mining Frequent Item Sets More Efficiently Using ITL Mining

The discovery of association rules is an important problem in data mining. It is a two-step process: finding the frequent itemsets, then generating association rules from them. Most research attention has focused on efficient methods of finding frequent itemsets, because that is computationally the most expensive step. This paper presents a new data structure and a more efficient algorithm for mining frequent itemsets from typical data sets. The improvement is achieved by scanning the database just once and by reducing item traversals within transactions. Performance comparisons of the algorithm against the fastest Apriori implementation and the recently developed H-Mine algorithm are given. The results show that the algorithm outperforms both Apriori and H-Mine on several widely used test data sets.
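The two-step process described above can be illustrated with a minimal Python sketch. This is not the ITL-based algorithm the paper proposes: it uses a plain level-wise (Apriori-style) search that rescans the transactions at every level, purely to show what "frequent itemsets" and "rules generated from them" mean. The function names `frequent_itemsets` and `association_rules`, and the parameters `min_support` (an absolute count) and `min_confidence`, are assumptions made for this illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: return {itemset: support count} for itemsets seen in
    at least `min_support` transactions (level-wise search)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    frequent = {}
    candidates = [frozenset([item]) for item in items]
    while candidates:
        # Count each candidate by rescanning all transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets that differ in one item into (k+1)-candidates.
        candidates = list({a | b for a, b in combinations(list(level), 2)
                           if len(a | b) == len(a) + 1})
    return frequent


def association_rules(frequent, min_confidence):
    """Step 2: derive rules (antecedent, consequent, confidence) from the
    frequent itemsets; confidence = support(itemset) / support(antecedent)."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(sorted(itemset), size)):
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is already in `frequent`.
                confidence = count / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules
```

Step 1 dominates the cost here because every level rescans the whole database, which is the expense that a single-scan approach such as the one in this paper aims to avoid.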
