The limited value of cophenetic correlation as a clustering criterion

Abstract A common procedure for evaluating hierarchical cluster techniques is to compare the input data, in terms of for example a matrix of similarities or dissimilarities, with the output hierarchy expressed in matrix form. If an ordinary product-moment correlation is used for this comparison, the technique is known as that of cophenetic correlations, frequently used by numerical taxonomists. A high correlation between the input similarities and the output dendrogram has been regarded as a criterion of a successful classification. This paper contains a Monte Carlo study of the characteristics of the cophenetic correlation and a related measure of agreement which have been both interpreted in terms of generalized variance for some different hierarchical cluster algorithms. The generalized variance criterion chosen for this study is Wilk's lambda, whose sampling distribution under the null hypothesis of identical group centroids is used in this context to define the degree of separation between clusters. Thus, a probabilistic approach is introduced into the evaluation procedure. With the above definition of presence of clusters, use of the cophenetic correlation and related measures of agreement as criteria of goodness-of-fit is shown to be quite misleading in most cases. This is due to their large variability for low separation of clusters.