Paper Information - Association Rules Mining Using Heavy Itemsets

Association Rules Mining Using Heavy Itemsets

A well-known problem that limits the practical use of association rule mining algorithms is the extremely large number of rules generated. Such a large number of rules makes the algorithms inefficient and makes it difficult for end users to comprehend the discovered rules. We present the concept of a heavy itemset. An itemset A is heavy (for given support and confidence values) if all possible association rules made up of items only in A are present. We prove a simple necessary and sufficient condition for an itemset to be heavy. We present a formula for the number of possible rules for a given heavy itemset, and show that a heavy itemset compactly represents an exponential number of association rules. Along with two simple search algorithms, we present an efficient greedy algorithm to generate a collection of disjoint heavy itemsets in a given transaction database. We then present a modified Apriori algorithm that starts with a given collection of disjoint heavy itemsets and discovers more heavy itemsets, not necessarily disjoint from the given ones.
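The definition of a heavy itemset in the abstract can be illustrated with a small brute-force sketch. It is not the paper's necessary and sufficient condition, nor any of its search algorithms, which the abstract does not spell out. It checks the definition directly, assuming that a rule is a pair X -> Y of non-empty, disjoint subsets of A, that support is the fraction of transactions containing X ∪ Y, and that confidence is support(X ∪ Y) / support(X). All function names are hypothetical. Under the same assumption, an itemset of n items admits 3^n - 2^(n+1) + 1 such rules, which may or may not match the form of the formula given in the paper.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    ordered = sorted(items)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def rules_within(itemset):
    """Yield all rules X -> Y with X, Y non-empty, disjoint subsets of itemset.

    Assumption: this is what "all possible association rules made up of
    items only in A" means; the paper may define the rule space differently.
    """
    items = frozenset(itemset)
    for x in nonempty_subsets(items):
        for y in nonempty_subsets(items - x):
            yield x, y


def is_heavy_bruteforce(itemset, transactions, min_support, min_confidence):
    """Check the definition directly; exponential in len(itemset)."""
    for x, y in rules_within(itemset):
        sup_xy = support(x | y, transactions)
        sup_x = support(x, transactions)
        if sup_xy < min_support:
            return False
        if sup_x == 0 or sup_xy / sup_x < min_confidence:
            return False
    return True


def rule_count(n):
    """Number of rules X -> Y over n items under the assumption above.

    Each item goes to X, to Y, or to neither (3**n assignments); remove the
    assignments with X empty or Y empty (2**n each), counted once together.
    """
    return 3**n - 2 ** (n + 1) + 1
```

For example, with four transactions {a, b, c} and one transaction {a, b}, the itemset {a, b, c} passes the check at minimum support 0.5 and minimum confidence 0.7, and fails it once the confidence threshold is raised to 0.9, because the rule a -> bc has confidence 0.8. The exponential growth of `rule_count` is what makes a single heavy itemset a compact stand-in for many rules.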

Girish Keshav Palshikar | Mandar S. Kale | Manoj M. Apte
