An Empirical Comparison of Distance Statistics for Populations with Unequal Covariance Matrices

Three statistics that estimate the squared distance between two multivariate normal populations with unequal covariance matrices are compared empirically using two sets of data. The data consist of samples of equal size from the populations. The three statistics are: a. the classical Mahalanobis distance-squared statistic, D²; b. the distance-squared statistic, Д² (Russian D²), introduced by Reyment [1962]; c. a distance-squared statistic, D*², based on a minimax criterion of classification (Anderson and Bahadur [1962]). It is shown that Д² is not a useful statistic. D² and D*² are close in numerical value for the data considered. They can differ considerably for unequal sample sizes, as shown in a theorem for the special case Σ₂ = c²Σ₁. Some arguments in favor of D*² as an appropriate distance statistic are presented.
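
For readers who want a concrete reference point, the classical statistic D² can be sketched as follows. This is a minimal illustration that assumes the usual textbook definition, in which the difference of the sample mean vectors is weighted by the inverse of the pooled sample covariance matrix; the abstract does not state the formulas used in the paper, so the pooling convention, the function names, and the example data are all assumptions. The statistics of Reyment and of Anderson and Bahadur are not sketched, since their definitions are not given here.

```python
def mean_vector(sample):
    """Componentwise mean of a list of equal-length observations."""
    n, p = len(sample), len(sample[0])
    return [sum(row[j] for row in sample) / n for j in range(p)]


def covariance_matrix(sample):
    """Unbiased sample covariance matrix (divisor n - 1)."""
    n, p = len(sample), len(sample[0])
    m = mean_vector(sample)
    return [[sum((row[i] - m[i]) * (row[j] - m[j]) for row in sample) / (n - 1)
             for j in range(p)] for i in range(p)]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def mahalanobis_d2(sample1, sample2):
    """Classical D^2 using the pooled covariance (assumed convention)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean_vector(sample1), mean_vector(sample2)
    s1, s2 = covariance_matrix(sample1), covariance_matrix(sample2)
    p = len(m1)
    pooled = [[((n1 - 1) * s1[i][j] + (n2 - 1) * s2[i][j]) / (n1 + n2 - 2)
               for j in range(p)] for i in range(p)]
    diff = [u - v for u, v in zip(m1, m2)]
    weights = solve(pooled, diff)  # pooled^{-1} (m1 - m2) without forming the inverse
    return sum(d * w for d, w in zip(diff, weights))


# Made-up illustrative data: two samples of equal size, shifted by 3 in the first coordinate.
sample_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
sample_b = [(3.0, 0.0), (5.0, 0.0), (3.0, 2.0), (5.0, 2.0)]
d2 = mahalanobis_d2(sample_a, sample_b)
```

Because the pooled matrix averages the two sample covariance matrices, this form implicitly treats the population covariance matrices as equal, which is the assumption the other two statistics are meant to relax.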