About the Lossless Reduction of the Minimal Generator Family of a Context

Minimal generators (MGs), also known as minimal keys, play an important role in many theoretical and practical problem settings involving closure systems, which arise in graph theory, relational database design, data mining, and other fields. As the minimal elements of the equivalence classes associated with closures, MGs underlie many compressed representations: for instance, they form the premises of canonical implications and association rules, with closures as conclusions, that losslessly represent the entire rule family of a closure system. However, MGs often exhibit an intra-class combinatorial redundancy that makes exhaustive storage and use impractical. In this respect, the succinct system of minimal generators (SSMG) recently introduced by Dong et al. is a first step towards a lossless reduction of this redundancy. Yet, as shown elsewhere, some of the claims made about the SSMG, e.g., its invariant size and lossless nature, do not hold. As a remedy, we propose a new succinct family that restores losslessness by adding a few further elements to the SSMG core, and we provide a theoretical grounding for the whole construction. We present methods for computing the new family, together with empirical evidence on its size relative to the entire MG family and to similar structures from the literature.
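
To make the notions of closure, equivalence class, and minimal generator concrete, the following is a minimal brute-force sketch on a toy context. The context, the names `CONTEXT`, `closure`, and `minimal_generators`, and the exhaustive enumeration are all assumptions made for illustration; this is not the computing method of the paper and it does not construct the SSMG or the new succinct family.

```python
from itertools import combinations

# Hypothetical toy context (not from the paper): object -> set of attributes.
CONTEXT = {
    "o1": {"a", "b", "c"},
    "o2": {"a"},
    "o3": {"b"},
    "o4": {"c"},
}


def closure(itemset, context=CONTEXT):
    """Return the attributes shared by every object that contains `itemset`."""
    itemset = frozenset(itemset)
    intents = [frozenset(attrs) for attrs in context.values() if itemset <= attrs]
    if not intents:
        # No object contains the itemset: by convention the closure is
        # the full attribute set.
        return frozenset().union(*context.values())
    return frozenset.intersection(*intents)


def minimal_generators(context=CONTEXT):
    """Map each closed itemset to the minimal itemsets having it as closure.

    Brute force over all attribute subsets by increasing size, so any
    smaller generator of the same class is already recorded when a
    candidate is examined. Exponential; only meant for tiny examples.
    """
    attributes = sorted(frozenset().union(*context.values()))
    generators = {}
    for size in range(len(attributes) + 1):
        for combo in combinations(attributes, size):
            candidate = frozenset(combo)
            found = generators.setdefault(closure(candidate, context), [])
            # Keep the candidate only if no proper subset generates the same closure.
            if not any(g < candidate for g in found):
                found.append(candidate)
    return generators


for closed, gens in minimal_generators().items():
    print(sorted(closed), "<-", [sorted(g) for g in gens])
```

In this toy context the closed itemset {a, b, c} has three minimal generators, {a, b}, {a, c}, and {b, c}, which is a small instance of the intra-class redundancy that a succinct family aims to reduce.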
