Multivariate Tests for Clusters

Abstract. Criteria are considered for testing the hypothesis (in multiresponse data) that the observations are a random sample from one multinormal population versus the alternative that, for some partition of the data, the observations arise from two multinormal populations with different means. This formulation is analogous to the univariate one of Engelman and Hartigan (1969), who studied a likelihood ratio test for clusters. A union-intersection (UI) criterion is developed that is computationally more manageable than the multivariate likelihood ratio (LR) criterion. The UI and LR criteria, together with a “linear discrimination” statistic, are nevertheless shown to be equivalent. Some properties of the tests are provided.
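To make the alternative hypothesis concrete, the following is a minimal sketch of a brute-force search over two-group partitions. It is not the computation developed in the paper. It assumes the two populations share a common covariance matrix (the abstract states only that the means differ), in which case the standard normal-theory LR statistic for a fixed partition is a monotone function of |T|/|W|, where T is the total scatter matrix and W is the pooled within-group scatter matrix. The function names `scatter` and `max_scatter_ratio` are hypothetical, and exhaustive enumeration is feasible only for small samples.

```python
import itertools

import numpy as np


def scatter(X):
    """Sum-of-squares-and-products matrix about the sample mean."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered


def max_scatter_ratio(X):
    """Maximise |T| / |W| over all two-group partitions of the rows of X.

    Assumption (not stated in the abstract): both groups share one
    covariance matrix, so W pools the two within-group scatter matrices.
    Returns (best ratio, indices of the smaller group).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if n < p + 2:
        raise ValueError("need at least p + 2 observations for a nonsingular W")

    # Work with log-determinants to avoid overflow or underflow.
    _, logdet_T = np.linalg.slogdet(scatter(X))

    best_ratio, best_group = -np.inf, None
    # Enumerating groups of size 1..n//2 covers every partition
    # (equal-size splits are visited twice, which is harmless).
    for k in range(1, n // 2 + 1):
        for group in itertools.combinations(range(n), k):
            mask = np.zeros(n, dtype=bool)
            mask[list(group)] = True
            W = scatter(X[mask]) + scatter(X[~mask])
            sign, logdet_W = np.linalg.slogdet(W)
            if sign <= 0:
                continue  # skip partitions with a singular pooled W
            ratio = float(np.exp(logdet_T - logdet_W))
            if ratio > best_ratio:
                best_ratio, best_group = ratio, group
    return best_ratio, best_group
```

Calling `max_scatter_ratio(X)` on an n-by-p data array returns the largest ratio and the row indices of one group of the maximising partition. For two groups of sizes n1 and n2, the matrix determinant lemma gives |T|/|W| = 1 + (n1 n2 / n) d' W^{-1} d, where d is the difference of the group mean vectors, so the ratio grows with a Mahalanobis-type distance between the two group means. The number of partitions grows exponentially with n, which illustrates why a criterion that is easier to compute is of interest.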