Mining the smallest association rule set for predictions

Mining a transaction database for association rules usually generates a large number of rules, most of which are unnecessary when the rules are used for prediction. In this paper we define a rule set for a given transaction database that is much smaller than the association rule set but makes the same predictions as the association rule set by confidence priority. We call this rule set the informative rule set. The informative rule set is not constrained to particular target items, and it is smaller than the non-redundant association rule set. We present an algorithm that generates the informative rule set directly, i.e., without first generating all frequent itemsets, and that accesses the database less often than other unconstrained direct methods. We show experimentally that the informative rule set is much smaller than both the association rule set and the non-redundant association rule set, and that it can be generated more efficiently.
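
To make the idea of a smaller rule set with identical predictions concrete, the following is a minimal sketch and not the algorithm of the paper. It rests on two assumptions that the abstract does not state: that a rule X -> y can be dropped whenever a more general rule X' -> y (X' a proper subset of X) has confidence at least as high, and that prediction by confidence priority means choosing, for each candidate item, the highest-confidence rule whose antecedent is contained in the observed items. The toy transactions, the function names, and the thresholds `min_count` and `min_confidence` are all hypothetical. The sketch enumerates rules by brute force and prunes afterwards, which is exactly what a direct method avoids.

```python
from fractions import Fraction
from itertools import combinations


def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions)


def mine_rules(transactions, min_count, min_confidence):
    """Brute-force all single-consequent rules X -> y (illustration only)."""
    items = sorted(set().union(*transactions))
    rules = {}
    for size in range(1, len(items)):
        for antecedent in combinations(items, size):
            x = frozenset(antecedent)
            count_x = count(x, transactions)
            if count_x == 0:
                continue
            for y in items:
                if y in x:
                    continue
                count_xy = count(x | {y}, transactions)
                # Exact fractions avoid float ties when comparing confidences.
                confidence = Fraction(count_xy, count_x)
                if count_xy >= min_count and confidence >= min_confidence:
                    rules[(x, y)] = confidence
    return rules


def prune_to_informative(rules):
    """Assumed criterion: drop X -> y if some proper subset X' of X
    yields X' -> y with confidence at least as high."""
    kept = {}
    for (x, y), conf in rules.items():
        dominated = any(
            y2 == y and x2 < x and conf2 >= conf
            for (x2, y2), conf2 in rules.items()
        )
        if not dominated:
            kept[(x, y)] = conf
    return kept


def best_confidence(rules, observed):
    """For each item not yet observed, the highest confidence among
    rules whose antecedent is contained in `observed`."""
    best = {}
    for (x, y), conf in rules.items():
        if x <= observed and y not in observed:
            best[y] = max(conf, best.get(y, conf))
    return best


# Hypothetical toy database of five transactions.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ac"),
    frozenset("bd"),
    frozenset("abd"),
]
rules = mine_rules(transactions, min_count=2, min_confidence=Fraction(1, 2))
informative = prune_to_informative(rules)
print(len(rules), len(informative))
```

On this toy data the pruned set is a strict subset of the full rule set, yet `best_confidence` returns the same answer for every possible set of observed items, because each dropped rule is covered by a retained, more general rule of equal or higher confidence.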
