Concise representations for approximate association rules

The quality of association rule mining has drawn increasing attention in recent years. One quality problem is the sheer size of the extracted rule set: for a given dataset, a huge number of rules can be extracted, but many of them are redundant with respect to other rules and thus useless in practice. Mining non-redundant rules is a promising approach to this problem. In this paper, we first propose a definition of redundancy; we then propose a concise representation, called the reliable basis, for representing non-redundant association rules, covering both exact and approximate rules. We prove that redundancy elimination based on the reliable basis does not reduce the belief in the extracted rules. We also prove that all association rules can be deduced from the reliable basis; the reliable basis is therefore a lossless representation of association rules. Experimental results show that the reliable basis significantly reduces the number of extracted rules.
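
As a minimal illustration of the distinction between exact and approximate rules, the sketch below uses the standard definitions of support and confidence: a rule is exact when its confidence is 1 and approximate otherwise. It does not implement the reliable basis or the redundancy definition proposed in the paper; the function names and the toy transactions are assumptions made only for this example.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs in the transactions")
    n_both = count(frozenset(antecedent) | frozenset(consequent), transactions)
    return n_both / n_antecedent


def rule_kind(antecedent, consequent, transactions):
    """'exact' if the rule holds in every transaction containing the antecedent."""
    n_antecedent = count(antecedent, transactions)
    if n_antecedent == 0:
        raise ValueError("antecedent never occurs in the transactions")
    n_both = count(frozenset(antecedent) | frozenset(consequent), transactions)
    # Compare integer counts to avoid floating-point equality checks.
    return "exact" if n_both == n_antecedent else "approximate"


# Hypothetical toy dataset (not from the paper).
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"b", "c"}),
]

print(rule_kind({"a"}, {"b"}, transactions))  # every transaction with a also has b
print(rule_kind({"b"}, {"c"}, transactions))  # only some transactions with b have c
```

In this toy data, the rule a -> b holds in all three transactions that contain a, so it is exact, while b -> c holds in two of the four transactions that contain b, so it is approximate with confidence 0.5.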