Between-group analysis with heterogeneous covariance matrices: The common principal component model

Analysis of between-group differences using canonical variates assumes equality of population covariance matrices. Sometimes these matrices are sufficiently different for the null hypothesis of equality to be rejected, but there exist some common features which should be exploited in any analysis. The common principal component model is often suitable in such circumstances, and this model is shown to be appropriate in a practical example. Two methods for between-group analysis are proposed when this model replaces the equal dispersion matrix assumption. One method is by extension of the two-stage approach to canonical variate analysis using sequential principal component analyses as described by Campbell and Atchley (1981). The second method is by definition of a distance function between populations satisfying the common principal component model, followed by metric scaling of the resulting between-populations distance matrix. The two methods are compared with each other and with ordinary canonical variate analysis on the previously introduced data set.
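The second method ends with metric scaling of a between-populations distance matrix. As a minimal sketch of that final step only, the Python below applies classical metric scaling (principal coordinate analysis in the sense of Gower, 1966) to a distance matrix. The abstract does not give the distance function defined under the common principal component model, so the matrix `D` here is built from hypothetical centroids with ordinary Euclidean distance, purely as a stand-in. The function name `metric_scaling` and all data are assumptions made for illustration.

```python
import numpy as np


def metric_scaling(D, k=2):
    """Classical metric scaling of a symmetric distance matrix D.

    Returns the n x k coordinate matrix and the k largest eigenvalues
    of the doubly centered matrix B = -1/2 * J * D^2 * J.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # doubly centered squared distances
    evals, evecs = np.linalg.eigh(B)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]      # indices of the k largest
    evals, evecs = evals[order], evecs[:, order]
    # Scale eigenvectors by root eigenvalues; clip tiny negatives to zero.
    coords = evecs * np.sqrt(np.clip(evals, 0.0, None))
    return coords, evals


# Hypothetical population centroids (illustration only, not from the paper).
means = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])

# Stand-in distance matrix: plain Euclidean distance between centroids.
D = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=-1)

coords, evals = metric_scaling(D, k=2)

# Distances between the recovered coordinates, for comparison with D.
D_hat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
```

Because the stand-in distances are Euclidean in two dimensions, two coordinates reproduce `D` exactly. With a distance that is not Euclidean, some eigenvalues of `B` may be negative, and the low-dimensional configuration is then only an approximation.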

[1]  K. Matusita Decision rule, based on the distance, for the classification problem , 1956 .

[2]  S. M. Ali,et al.  A General Class of Coefficients of Divergence of One Distribution from Another , 1966 .

[3]  B. Flury Common Principal Components and Related Multivariate Models , 1988 .

[4]  Solomon Kullback,et al.  Information Theory and Statistics , 1960 .

[5]  L. Lecam On the Assumptions Used to Prove Asymptotic Normality of Maximum Likelihood Estimates , 1970 .

[6]  P. Mahalanobis On the generalized distance in statistics , 1936 .

[7]  N. A. Campbell,et al.  Canonical variate analysis with unequal covariance matrices: Generalizations of the usual solution , 1984 .

[8]  H. Chernoff Some Measures for Discriminating between Normal Multivariate Distributions with Unequal Covariance Matrices , 1973 .

[9]  J. Gower Some distance properties of latent root and vector methods used in multivariate analysis , 1966 .

[10]  R. Reyment Observations on Homogeneity of Covariance Matrices in Paleontologic Biometry , 1962 .

[11]  John C. W. Rayner,et al.  The comparison of sample covariance matrices using likelihood ratio tests , 1987 .

[12]  A. F. Mitchell,et al.  The Mahalanobis distance and elliptic distributions , 1985 .

[13]  K. Matusita Classification based on distance in multivariate Gaussian cases , 1967 .

[14]  Herbert Arkin,et al.  Tables for Statisticians , 1963 .

[15]  John C. Gower,et al.  Statistical methods of comparing different multivariate analyses of the same data , 1971 .

[16]  Wojtek J. Krzanowski,et al.  Principles of Multivariate Analysis: A User's Perspective. Oxford , 1988 .

[17]  Douglas B. Clarkson Remark AS R71: A Remark on Algorithm AS 211. The F-G Diagonalization Algorithm , 1988 .

[18]  R. A. Leibler,et al.  On Information and Sufficiency , 1951 .

[19]  H. Jeffreys,et al.  Theory of probability , 1896 .

[20]  R. L. Chaddha,et al.  An Empirical Comparison of Distance Statistics for Populations with Unequal Covariance Matrices , 1968 .

[21]  R. Sibson Studies in the Robustness of Multidimensional Scaling: Procrustes Statistics , 1978 .

[22]  G. Constantine,et al.  The F‐G Diagonalization Algorithm , 1985 .

[23]  Dean M. Young,et al.  Quadratic discrimination: Some results on optimal low-dimensional representation , 1987 .

[24]  D. B. Clarkson A Least Squares Version of Algorithm AS 211: The F‐G Diagonalization Algorithm , 1988 .

[25]  Calyampudi R. Rao Diversity and dissimilarity coefficients: A unified approach , 1982 .

[26]  T. W. Anderson,et al.  Classification into two Multivariate Normal Distributions with Different Covariance Matrices , 1962 .

[27]  W. Atchley,et al.  THE GEOMETRY OF CANONICAL VARIATE ANALYSIS , 1981 .