Paper information - MINI: Mining Informative Non-redundant Itemsets


Frequent itemset mining assists the data mining practitioner in searching for strongly associated items (and transactions) in large transaction databases. Since the number of frequent itemsets is usually extremely large and unmanageable for a human user, recent works have sought to define condensed representations of them, e.g. closed or maximal frequent itemsets. We argue that these methods not only often fall short of sufficiently reducing the output size, but also output many redundant itemsets. In this paper we propose a philosophically new approach that resolves both of these issues in a computationally tractable way. We present and empirically validate a statistically founded approach called MINI, which compresses the set of frequent itemsets down to a list of informative and non-redundant itemsets.

Nello Cristianini | Tijl De Bie | Arianna Gallo
