Visualizing association mining results through hierarchical clusters

We propose a new methodology for visualizing association mining results. Inter-item distances are computed from combinations of itemset supports. The new distances retain a simple pairwise structure and are consistent with important frequently occurring itemsets. Standard visualization tools, such as hierarchical clustering dendrograms, can therefore still be applied, while the distance information on which they are based is richer. Our approach is applicable to general association mining applications, as well as to applications involving information spaces modeled by directed graphs, e.g. the Web. In the context of collections of hypertext documents, the inter-document distances capture the information inherent in a collection's link structure, making the approach a form of link mining. We demonstrate our methodology with document sets extracted from the Science Citation Index, applying a metric that measures consistency between clusters and frequent itemsets.
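
The general idea of deriving inter-item distances from itemset supports and feeding them to a hierarchical clustering can be illustrated with a minimal sketch. It is not the distance proposed in this work: it assumes a simple Jaccard-style distance built only from single-item and item-pair supports, and it uses plain single-linkage merging in place of a dendrogram plot. The function names `pairwise_distances` and `single_linkage` and the toy transactions are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pairwise_distances(transactions):
    """Distance between items from support counts.

    Assumed formula (not taken from the paper):
    d(a, b) = 1 - count(a and b) / count(a or b).
    """
    items = sorted({item for t in transactions for item in t})
    count = {i: 0 for i in items}
    pair_count = {}
    for t in transactions:
        present = sorted(set(t))
        for i in present:
            count[i] += 1
        for pair in combinations(present, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1
    dist = {}
    for a, b in combinations(items, 2):
        both = pair_count.get((a, b), 0)
        either = count[a] + count[b] - both  # always > 0: each item occurs at least once
        dist[(a, b)] = 1.0 - both / either
    return items, dist


def single_linkage(items, dist):
    """Agglomerative single-linkage clustering; returns the merge history."""
    def d(x, y):
        return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

    clusters = [frozenset([i]) for i in items]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose closest members are nearest.
        dist_ab, a, b = min(
            ((min(d(x, y) for x in a for y in b), a, b)
             for a, b in combinations(clusters, 2)),
            key=lambda m: m[0],
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b), dist_ab))
    return merges


# Toy transactions (hypothetical data): each set lists items that occur together.
transactions = [
    {"A", "B"}, {"A", "B"}, {"A", "B", "C"}, {"C", "D"}, {"C", "D"},
]
items, dist = pairwise_distances(transactions)
merges = single_linkage(items, dist)
for left, right, height in merges:
    print(left, right, round(height, 3))
```

In this toy example, items that always co-occur merge first at distance 0, and the merge history plays the role of the dendrogram. A distance built from higher-order itemset supports, as described above, would replace `pairwise_distances` while leaving the clustering step unchanged.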