Mining Non-Derivable Association Rules

Association rule mining typically produces a large number of redundant rules. We introduce efficient methods for deriving tight bounds for the confidences of association rules, given their subrules. If the lower and upper bounds of a rule coincide, the confidence is uniquely determined by the subrules and the rule can be pruned as redundant, or derivable, without any loss of information. Experiments on real, dense benchmark data sets show that, depending on the case, up to 99–99.99% of rules are derivable. A lossy pruning strategy, which removes those rules whose bounded confidence interval is at most 1 percentage point wide, reduced the number of rules by a further order of magnitude. The novelty of our work is twofold. First, it gives absolute bounds for the confidence instead of relying on point estimates or heuristics. Second, no specific inference system is assumed for computing the bounds; instead, the bounds follow from the definition of association rules. Our experimental results demonstrate that the bounds are usually narrow and that the approach has great practical significance, also in comparison to recent related approaches.
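To make the idea of a derivable rule concrete, the following is a minimal sketch and not the method of this paper. It assumes that the supports of all proper subsets of an itemset are known and applies the standard inclusion-exclusion bounds on itemset support, then divides by the antecedent support to bound the confidence. The function names (`subset_supports`, `support_bounds`, `confidence_bounds`) and the toy transactions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def subset_supports(transactions, itemset):
    """Support counts of every proper subset of `itemset` (assumed known)."""
    items = sorted(itemset)
    supp = {}
    for k in range(len(items)):  # sizes 0 .. |itemset| - 1
        for combo in combinations(items, k):
            s = frozenset(combo)
            supp[s] = sum(1 for t in transactions if s <= t)
    return supp


def support_bounds(itemset, supp):
    """Inclusion-exclusion bounds on supp(itemset) from its proper subsets.

    For each proper subset X of I, the signed sum over X <= J < I of
    (-1)^(|I \\ J| + 1) * supp(J) is an upper bound when |I \\ X| is odd
    and a lower bound when it is even.
    """
    I = frozenset(itemset)
    items = sorted(I)
    lower, upper = 0, supp[frozenset()]
    for k in range(len(items)):
        for combo in combinations(items, k):
            X = frozenset(combo)
            rest = sorted(I - X)
            total = 0
            for m in range(len(rest)):  # J = X | Y with Y a proper subset of I \ X
                for Y in combinations(rest, m):
                    J = X | frozenset(Y)
                    total += (-1) ** (len(I) - len(J) + 1) * supp[J]
            if len(rest) % 2 == 1:
                upper = min(upper, total)
            else:
                lower = max(lower, total)
    return lower, upper


def confidence_bounds(antecedent, consequent, supp):
    """Bounds on conf(A -> B) = supp(A | B) / supp(A); requires supp(A) > 0."""
    A = frozenset(antecedent)
    lo, hi = support_bounds(A | frozenset(consequent), supp)
    return lo / supp[A], hi / supp[A]


# Toy data (hypothetical): five transactions over items a, b, c.
transactions = [frozenset(t) for t in ("abc", "ab", "ac", "bc", "a")]

supp_abc = subset_supports(transactions, "abc")
bounds_ab_c = confidence_bounds("ab", "c", supp_abc)  # bounds coincide: derivable

supp_ab = subset_supports(transactions, "ab")
bounds_a_b = confidence_bounds("a", "b", supp_ab)     # bounds differ: not derivable

print(bounds_ab_c, bounds_a_b)
```

In this toy data the bounds for the rule ab -> c coincide, so its confidence is fixed by the subset supports and the rule could be pruned without loss, whereas the bounds for a -> b leave an interval and the rule would be kept. A lossy variant would instead prune any rule whose interval width falls below a chosen threshold.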
