Finding 'superclassifications' with an acceptable misclassification rate

Cluster analysis methods are based on measures of 'distance' between objects. When the objects have an internal structure, that structure can be exploited in defining such distances, which leads to non-standard cluster analysis methods. We illustrate with an application in which the objects are themselves classes and the aim is to produce clusters of classes that minimize the error rate of a supervised classification rule. In supervised classification problems with more than a handful of classes, there may exist groups of classes that are well separated from other groups, even though the individual classes are not all well separated from one another. In such cases the overall misclassification rate is a crude measure of performance, and more subtle measures that take account of subgroup separation are desirable. The fact that points can be assigned accurately to groups, even if not to individual classes, can be useful in practice.
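
To make the idea concrete, the following is a minimal sketch under stated assumptions, not the method of the paper. It assumes a confusion matrix from an already fitted classifier (rows are true classes, columns are predicted classes) and greedily merges the two groups of classes with the largest mutual confusion until the group-level misclassification rate falls to a chosen threshold. The function names `superclass_error` and `merge_classes`, the threshold `max_error`, and the toy confusion matrix are all hypothetical and invented for illustration.

```python
import numpy as np

def superclass_error(conf, groups):
    """Fraction of points whose predicted class lies outside the group
    (superclass) containing their true class."""
    conf = np.asarray(conf, dtype=float)
    # Counts inside each group's diagonal block are correct at group level.
    correct = sum(conf[np.ix_(g, g)].sum() for g in groups)
    return 1.0 - correct / conf.sum()

def merge_classes(conf, max_error):
    """Greedy agglomeration (an assumed heuristic): repeatedly merge the two
    groups with the largest mutual confusion until the group-level error
    rate is at most max_error or only one group remains."""
    conf = np.asarray(conf, dtype=float)
    groups = [[k] for k in range(conf.shape[0])]
    while len(groups) > 1 and superclass_error(conf, groups) > max_error:
        best_pair, best_confusion = None, -1.0
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                gi, gj = groups[i], groups[j]
                # Points of one group misassigned to the other, both ways.
                mutual = conf[np.ix_(gi, gj)].sum() + conf[np.ix_(gj, gi)].sum()
                if mutual > best_confusion:
                    best_pair, best_confusion = (i, j), mutual
        i, j = best_pair
        groups[i] = groups[i] + groups[j]
        del groups[j]
    return groups

# Toy confusion matrix (invented): classes 0 and 1 are confused with each
# other, as are classes 2 and 3, but the two pairs are well separated.
conf = [
    [40,  9,  1,  0],
    [ 8, 41,  0,  1],
    [ 1,  0, 42,  7],
    [ 0,  1, 10, 39],
]

groups = merge_classes(conf, max_error=0.05)
print(groups)                                    # [[0, 1], [2, 3]]
print(round(superclass_error(conf, groups), 3))  # 0.02
```

In this toy case the class-level error rate is 0.19, while the error rate for assignment to the two merged groups is 0.02, which is the kind of subgroup separation that the overall misclassification rate conceals. Merging on mutual confusion is only one possible choice of 'distance' between classes; other definitions would give different groupings.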