Data Mining Techniques for Concisely Representing Patterns Sets

This book focuses on mining frequent itemsets and association rules. Our detailed study shows that closed itemsets and minimal generators play a key role in concisely representing pattern sets. However, an intra-class combinatorial redundancy logically results from the inherent absence of a unique minimal generator associated with a given closed itemset. In this respect, we propose lossless reductions of the minimal generator set by means of a new substitution-based process. Our theoretical results are then extended to the association rule framework. We also conduct a thorough exploration of the disjunctive search space, where itemsets are characterized by their respective disjunctive supports instead of their conjunctive ones. To obtain a redundancy-free representation of the disjunctive search space, an interesting solution consists in selecting a unique element to represent the itemsets covering the same set of data. We then introduce a new operator dedicated to this task. This operator is at the root of new concise representations of frequent itemsets and is used for the derivation of generalized association rules.
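As a rough illustration of the notions named above, the following Python sketch computes conjunctive and disjunctive supports, closures, and minimal generators on a tiny dataset. The dataset, the function names, and the brute-force enumeration are all assumptions made for this example; they are not the algorithms, the substitution-based reduction, or the operator proposed in the book.

```python
from itertools import combinations

# Hypothetical toy dataset (an assumption for illustration only).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a"},
    {"b"},
    {"c"},
]
ITEMS = sorted(set().union(*TRANSACTIONS))


def conjunctive_support(itemset):
    """Number of transactions containing ALL items of the itemset."""
    return sum(1 for t in TRANSACTIONS if itemset <= t)


def disjunctive_support(itemset):
    """Number of transactions containing AT LEAST ONE item of the itemset."""
    return sum(1 for t in TRANSACTIONS if itemset & t)


def closure(itemset):
    """Largest itemset shared by every transaction that contains the itemset."""
    covering = [t for t in TRANSACTIONS if itemset <= t]
    if not covering:
        # Convention chosen here: an unsupported itemset closes to all items.
        return frozenset(ITEMS)
    return frozenset(set.intersection(*covering))


def is_minimal_generator(itemset):
    """True if removing any single item strictly increases the support."""
    support = conjunctive_support(itemset)
    return all(conjunctive_support(itemset - {i}) > support for i in itemset)


def generators_by_closure():
    """Group the non-empty minimal generators by their closed itemset."""
    groups = {}
    for size in range(1, len(ITEMS) + 1):
        for combo in combinations(ITEMS, size):
            itemset = frozenset(combo)
            if is_minimal_generator(itemset):
                groups.setdefault(closure(itemset), []).append(itemset)
    return groups


if __name__ == "__main__":
    for closed, gens in generators_by_closure().items():
        print(sorted(closed), "<-", [sorted(g) for g in gens])
    print(conjunctive_support(frozenset("ab")), disjunctive_support(frozenset("ab")))
```

On this toy dataset, the closed itemset {a, b, c} has three minimal generators ({a, b}, {a, c}, and {b, c}), which is the kind of intra-class redundancy described above. The itemset {a, b} also shows the gap between the two support measures: two transactions contain both items, while four contain at least one of them.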