Class compactness for data clustering

In this paper we introduce a compactness-based clustering algorithm. The compactness of a data class is measured by comparing its inter-subset and intra-subset distances: the class compactness of a subset is defined as the ratio of the two distances. A subset is called an isolated cluster (or icluster) if its class compactness is greater than 1. All iclusters form a containment tree. We introduce monotonic sequences of iclusters to simplify the structure of the icluster tree, and we design a clustering algorithm based on the simplified structure. The algorithm is effective on data sets whose clusters are nonlinearly separated, have arbitrary shapes, or have different densities. Its effectiveness is demonstrated by experiments.
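
To make the icluster criterion concrete, the following is a minimal sketch and not the paper's definition. The abstract does not state how the two distances are computed, so the sketch assumes that the inter-subset distance is the smallest distance from a point in the subset to a point outside it, that the intra-subset distance is the longest edge of the subset's minimum spanning tree, and that the ratio is inter over intra. The function names `intra_distance`, `inter_distance`, `class_compactness`, and `is_icluster` are hypothetical.

```python
import math


def intra_distance(subset):
    """Assumed intra-subset distance: longest edge of the subset's
    minimum spanning tree, built with Prim's algorithm."""
    if len(subset) < 2:
        return 0.0
    # best[i] = shortest distance from point i to the tree built so far
    best = {i: math.dist(subset[0], subset[i]) for i in range(1, len(subset))}
    longest = 0.0
    while best:
        i = min(best, key=best.get)          # closest point not yet in the tree
        longest = max(longest, best.pop(i))  # edge that attaches it
        for j in best:
            best[j] = min(best[j], math.dist(subset[i], subset[j]))
    return longest


def inter_distance(subset, rest):
    """Assumed inter-subset distance: smallest distance between a point
    in the subset and a point outside it."""
    return min(math.dist(p, q) for p in subset for q in rest)


def class_compactness(subset, rest):
    """Ratio of inter-subset to intra-subset distance (assumed orientation)."""
    if not rest:
        return math.inf  # convention chosen here: nothing outside the subset
    intra = intra_distance(subset)
    if intra == 0.0:
        return math.inf  # convention chosen here: single point or duplicates
    return inter_distance(subset, rest) / intra


def is_icluster(subset, rest):
    """A subset is an icluster if its class compactness is greater than 1."""
    return class_compactness(subset, rest) > 1


left = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
right = [(5.0, 5.0), (6.0, 5.0)]
mixed = [(0.0, 0.0), (5.0, 5.0)]
mixed_rest = [(1.0, 0.0), (0.0, 1.0), (6.0, 5.0)]

print(is_icluster(left, right))        # tight group, far from the rest
print(is_icluster(mixed, mixed_rest))  # subset that straddles both groups
```

Under these assumptions the tight group `left` qualifies as an icluster, while `mixed`, which takes one point from each group, does not: its internal spanning edge is much longer than its distance to the nearest outside point.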