A Multi-Relational Hierarchical Clustering Algorithm Based on Shared Nearest Neighbor Similarity

Clustering in relational databases is an active research topic in data mining. In this paper, we introduce a multi-relational hierarchical clustering algorithm based on shared nearest neighbor similarity (MHSNNS). First, the algorithm joins the tables through tuple ID propagation. Then, it groups objects into a large number of relatively small sub-clusters using the shared nearest neighbor algorithm and cluster cohesion. Finally, it finds the genuine clusters by repeatedly merging these sub-clusters according to cluster separation. Experiments show the efficiency and scalability of this approach.
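
To make the notion of shared nearest neighbor similarity concrete, the following is a minimal sketch and not the MHSNNS implementation itself. It assumes the common definition in which two objects are similar only if each appears in the other's k-nearest-neighbor list, and their similarity is the number of neighbors they share. The function names, the use of Euclidean distance, the value of k, and the sample points are all illustrative assumptions.

```python
import math


def k_nearest_neighbors(points, k):
    """Return, for each point index, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p (ties broken by index).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Count the neighbors shared by i and j; zero unless each lists the other."""
    if i not in neighbors[j] or j not in neighbors[i]:
        return 0
    return len(neighbors[i] & neighbors[j])


# Hypothetical data: two well-separated groups of four points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
]
neighbors = k_nearest_neighbors(points, k=3)

print(snn_similarity(neighbors, 0, 1))  # points in the same group
print(snn_similarity(neighbors, 0, 4))  # points in different groups
```

Under these assumptions, objects in the same dense group share neighbors and receive a positive similarity, while objects in different groups receive zero. This is the property that lets a method of this kind form small sub-clusters first and merge them afterwards.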