Cluster analysis is concerned with partitioning a given set of entities into homogeneous and well-separated subsets called clusters. The concepts of homogeneity and separation can be made precise when a measure of dissimilarity between the entities is given. Define the diameter of a partition as the maximum dissimilarity between any pair of entities in the same cluster, and the split of a partition as the minimum dissimilarity between any pair of entities in different clusters. The problems of determining a partition into a given number of clusters with minimum diameter (i.e., a partition of maximum homogeneity) or with maximum split (i.e., a partition of maximum separation) are considered first. It is shown that the latter problem can be solved by the classical single-link clustering algorithm, while the former can be solved by a graph-theoretic algorithm involving the optimal coloration of a sequence of partial graphs, described in more detail in a previous paper. A partition into a given number of clusters is called efficient if and only if there exists no partition into at most the same number of clusters with either a smaller diameter and a split that is not smaller, or a larger split and a diameter that is not larger. Two efficient partitions are called equivalent if and only if they have the same split and the same diameter.
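The definitions above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the dissimilarities are assumed to be given as a symmetric matrix `d`, a partition is assumed to be a list of cluster labels, and all function names (`diameter`, `split`, `single_link_partition`, `dominates`) are hypothetical. The single-link step is written as a Kruskal-style merge of the closest pairs until the requested number of clusters remains, which is a standard way to obtain single-link partitions.

```python
from itertools import combinations

def diameter(d, labels):
    """Largest dissimilarity between two entities in the same cluster."""
    pairs = combinations(range(len(labels)), 2)
    return max((d[i][j] for i, j in pairs if labels[i] == labels[j]), default=0)

def split(d, labels):
    """Smallest dissimilarity between two entities in different clusters."""
    pairs = combinations(range(len(labels)), 2)
    return min((d[i][j] for i, j in pairs if labels[i] != labels[j]),
               default=float("inf"))

def single_link_partition(d, k):
    """Merge the closest pairs (Kruskal-style) until k clusters remain."""
    n = len(d)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = n
    for _, i, j in sorted((d[i][j], i, j) for i, j in combinations(range(n), 2)):
        if clusters == k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            clusters -= 1

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(n)]

def dominates(a, b):
    """True if (diameter, split) pair a is better than b in the sense used
    to define efficiency: smaller diameter and not smaller split, or
    larger split and not larger diameter."""
    return (a[0] < b[0] and a[1] >= b[1]) or (a[1] > b[1] and a[0] <= b[0])

# Hypothetical data: five points on a line, dissimilarity = absolute difference.
points = [0, 1, 2, 10, 11]
d = [[abs(p - q) for q in points] for p in points]
labels = single_link_partition(d, 2)
print(labels, diameter(d, labels), split(d, labels))
```

On this toy data the single-link partition separates the three points near 0 from the two points near 10. A partition is efficient in the sense above when no competing partition into at most the same number of clusters has a (diameter, split) pair for which `dominates` returns true.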
[1] Michael R. Anderberg et al. Cluster Analysis for Applications. 1973.
[2] B. S. Duran et al. Cluster Analysis: A Survey. 1974.
[3] R. M. Cormack et al. A Review of Classification. 1971.
[4] R. Fisher. The Use of Multiple Measurements in Taxonomic Problems. 1936.
[5] Robin Sibson et al. SLINK: An Optimally Efficient Algorithm for the Single-Link Cluster Method. Comput. J., 1973.
[6] R. Prim. Shortest Connection Networks and Some Generalizations. 1957.
[7] Enrique H. Ruspini et al. Numerical Methods for Fuzzy Clustering. Inf. Sci., 1970.
[8] L. Hubert. Spanning Trees and Aspects of Clustering. 1974.
[9] J. Gower et al. Minimum Spanning Trees and Single Linkage Cluster Analysis. 1969.
[10] L. Hubert. Some Applications of Graph Theory to Clustering. 1974.
[11] C. Berge. Graphes et hypergraphes. 1970.
[12] M. Rao. Cluster Analysis and Mathematical Programming. 1971.
[13] Charles T. Zahn et al. Graph-Theoretical Methods for Detecting and Describing Gestalt Clusters. IEEE Transactions on Computers, 1971.
[14] S. C. Johnson. Hierarchical Clustering Schemes. Psychometrika, 1967.
[15] K. Florek et al. Sur la liaison et la division des points d'un ensemble fini. 1951.
[16] P. Hansen et al. Complete-Link Cluster Analysis by Graph Coloring. 1978.