CCAIIA: Clustering Categorical Attributes into Interesting Association Rules

We investigate the problem of mining interesting association rules over a pair of categorical attributes at any level of data granularity. We do this by integrating the rule discovery process with a form of clustering. This allows associations to be formed between groups of items, where the grouping of items is based on maximising the “interestingness” of the associations discovered. Previous work on mining generalised associations assumes either a distance metric on the attribute values or a taxonomy over the items mined, and uses that metric or taxonomy to limit the space of possible associations that can be found. We develop a measure of the interestingness of association rules based on support and on the dependency between the item sets, and use this measure to guide the search. We apply the method to a data set and observe the extraction of “interesting” associations. Because the search space is not limited by user-defined hierarchies, this method could allow interesting and unexpected associations to be discovered.
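To make the idea of a support-and-dependency measure concrete, the following is a minimal sketch, not the measure defined in the paper. It assumes the data is a list of (a, b) value pairs for the two categorical attributes, and it uses leverage (observed joint support minus the support expected under independence) as a stand-in for "dependency". The function name `interestingness`, the `min_support` threshold, and the sample data are all hypothetical.

```python
def interestingness(rows, group_a, group_b, min_support=0.1):
    """Score an association between a group of values of attribute A
    and a group of values of attribute B.

    ASSUMPTION: this is an illustrative stand-in, not the paper's measure.
    Score = leverage = P(A in group_a, B in group_b) - P(A in group_a) * P(B in group_b),
    returned only when the joint support reaches min_support (otherwise 0.0).
    """
    n = len(rows)
    if n == 0:
        return 0.0
    count_a = sum(1 for a, _ in rows if a in group_a)
    count_b = sum(1 for _, b in rows if b in group_b)
    count_ab = sum(1 for a, b in rows if a in group_a and b in group_b)
    support = count_ab / n                       # observed joint frequency
    expected = (count_a / n) * (count_b / n)     # frequency if A and B were independent
    if support < min_support:
        return 0.0
    return support - expected


# Hypothetical sample data: (drink, snack) pairs.
rows = [
    ("tea", "biscuit"), ("coffee", "biscuit"),
    ("tea", "cake"), ("coffee", "cake"),
    ("juice", "fruit"), ("water", "fruit"),
    ("juice", "fruit"), ("water", "fruit"),
]

single = interestingness(rows, {"juice"}, {"fruit"})            # 0.125
grouped = interestingness(rows, {"juice", "water"}, {"fruit"})  # 0.25
```

In this toy data, grouping "juice" and "water" yields a higher score than either item alone, which is the kind of signal a clustering step could use when deciding which items to merge. How the paper actually combines support and dependency, and how it searches over groupings, is not specified in the abstract.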
