Paper information - A framework for mining interesting pattern sets

A framework for mining interesting pattern sets

This paper suggests a framework for mining subjectively interesting pattern sets that is based on two components: (1) the encoding of prior information in a model for the data miner's state of mind; (2) the search for a pattern set that is maximally informative while efficient to convey to the data miner. We illustrate the framework with an instantiation for tile patterns in binary databases where prior information on the row and column marginals is available. This approach implements step (1) above by constructing the MaxEnt model with respect to the prior information [2, 3], and step (2) by relying on concepts from information and coding theory. We provide a brief overview of a number of possible extensions and future research challenges, including a key challenge related to the design of empirical evaluations for subjective interestingness measures.
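The two steps named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption for illustration. It assumes the prior information constrains the expected row and column sums of the binary database. Under that assumption, the MaxEnt distribution factorizes into independent cells with P(D_ij = 1) = 1 / (1 + exp(-(lambda_i + mu_j))). The function names (`fit_maxent`, `tile_information`, `tile_score`), the plain gradient-ascent fit, and the description-length proxy (one unit per row or column index in the tile) are hypothetical and are not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_maxent(D, iters=5000, lr=0.5):
    """Fit cell probabilities whose expected row/column sums match those of D.

    Assumed model: P(D[i][j] = 1) = sigmoid(lam[i] + mu[j]).
    Fitted by simple gradient ascent on the (concave) log-likelihood.
    """
    n, m = len(D), len(D[0])
    row_target = [sum(row) for row in D]
    col_target = [sum(D[i][j] for i in range(n)) for j in range(m)]
    lam = [0.0] * n
    mu = [0.0] * m
    for _ in range(iters):
        P = [[sigmoid(lam[i] + mu[j]) for j in range(m)] for i in range(n)]
        # Gap between observed and expected marginals is the gradient.
        row_gap = [row_target[i] - sum(P[i]) for i in range(n)]
        col_gap = [col_target[j] - sum(P[i][j] for i in range(n)) for j in range(m)]
        lam = [lam[i] + lr * row_gap[i] / m for i in range(n)]
        mu = [mu[j] + lr * col_gap[j] / n for j in range(m)]
    return [[sigmoid(lam[i] + mu[j]) for j in range(m)] for i in range(n)]

def tile_information(P, rows, cols):
    """Self-information in bits of seeing all ones in the tile rows x cols."""
    return -sum(math.log2(P[i][j]) for i in rows for j in cols)

def tile_score(P, rows, cols):
    """Information per unit of description length.

    Hypothetical proxy: description length = number of row and column
    indices needed to name the tile. The paper's actual encoding may differ.
    """
    return tile_information(P, rows, cols) / (len(rows) + len(cols))

# Toy binary database: row sums (3, 2, 2, 1), column sums (2, 2, 2, 1, 1).
D = [
    [1, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0],
]
P = fit_maxent(D)

# A one in a sparse row is more surprising than a one in a dense row.
print("dense-row cell :", tile_information(P, [0], [0]))
print("sparse-row cell:", tile_information(P, [3], [2]))
print("score of tile rows=[0], cols=[0,1,2]:", tile_score(P, [0], [0, 1, 2]))
```

In this sketch a tile that falls in rows and columns with low marginals carries more information, because the fitted model considers ones there less likely. Dividing by a description length is one plausible reading of "maximally informative while efficient to convey".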

Tijl De Bie | Eirini Spyropoulou | Kleanthis-Nikolaos Kontonasios

[1] Leon Gordon Kraft, et al. A device for quantizing, grouping, and coding amplitude-modulated pulses, 1949.

[2] Abraham Silberschatz, et al. On Subjective Measures of Interestingness in Knowledge Discovery, 1995, KDD.

[3] Balaji Padmanabhan, et al. A Belief-Driven Method for Discovering Unexpected Patterns, 1998, KDD.

[4] Balaji Padmanabhan, et al. Small is beautiful: discovering the minimal set of unexpected patterns, 2000, KDD.

[5] Mohammed J. Zaki, et al. CHARM: An Efficient Algorithm for Closed Itemset Mining, 2002, SDM.

[6] Szymon Jaroszewicz, et al. Interestingness of frequent itemsets using Bayesian networks as background knowledge, 2004, KDD.

[7] Bart Goethals, et al. Tiling Databases, 2004, Discovery Science.

[8] J. Winderickx, et al. Inferring transcriptional modules from ChIP-chip, motif and microarray data, 2006, Genome Biology.

[9] Jilles Vreeken, et al. Item Sets that Compress, 2006, SDM.

[10] Howard J. Hamilton, et al. Interestingness measures for data mining: A survey, 2006, CSUR.

[11] Luc De Raedt, et al. Constraint-Based Pattern Set Mining, 2007, SDM.