Exploring interestingness through clustering: a framework

Determining interestingness is a notoriously difficult problem: it is subjective and hard to capture. It is also becoming increasingly important in knowledge discovery in databases as the number of mined patterns grows. In this work we introduce and investigate a framework for association rule clustering that automates much of the laborious manual effort normally involved in exploring and understanding interestingness. Clustering is well suited to this task: it is the unsupervised organization of patterns into groups such that patterns in the same group are more similar to each other than to patterns in other groups. We also define a data-driven, inferred labeling of these clusters, the ancestor coverage, which provides an intuitive and concise representation of the clusters.
