Analysis of distance for structured multivariate data and extensions to multivariate analysis of variance

Many data sets in practice fit a multivariate analysis of variance (MANOVA) structure but are not consonant with MANOVA assumptions. One particular such data set from economics is described. This set has a 2^4 factorial design with eight variables measured on each individual, but the application of MANOVA seems inadvisable given the highly skewed nature of the data. To establish a basis for analysis, we examine the structure of distance matrices in the presence of a priori grouping of units and show how the total squared distance among the units of a multivariate data set can be partitioned according to the factors of an external classification. The partitioning is exactly analogous to that in the univariate analysis of variance. It therefore provides a framework for the analysis of any data set whose structure conforms to that of MANOVA, but which for various reasons cannot be analysed by this technique. Descriptive aspects of the technique are considered in detail, and inferential questions are tackled via randomization tests. This approach provides a satisfactory analysis of the economics data.
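The partitioning idea can be illustrated with a minimal sketch for the simplest case, a single grouping factor. It is not the authors' implementation and does not cover the full factorial decomposition. It assumes the standard identity that the sum of squared distances over all pairs of n units, divided by n, plays the role of a total sum of squares, and that applying the same formula within each group gives the within-group term. The function names `partition_squared_distance` and `randomization_p_value`, the synthetic skewed data, and the choice of squared Euclidean distance are all assumptions made for illustration.

```python
import numpy as np


def partition_squared_distance(d2, labels):
    """Split total squared distance into within- and between-group parts.

    d2     : (n, n) symmetric matrix of squared distances, zero diagonal
    labels : length-n array of group labels (one-way classification)
    """
    d2 = np.asarray(d2, dtype=float)
    labels = np.asarray(labels)
    n = d2.shape[0]
    # The full matrix counts every pair twice, hence the factor 2.
    total = d2.sum() / (2.0 * n)
    within = 0.0
    for g in np.unique(labels):
        idx = np.flatnonzero(labels == g)
        within += d2[np.ix_(idx, idx)].sum() / (2.0 * idx.size)
    return total, within, total - within


def randomization_p_value(d2, labels, n_perm=999, seed=0):
    """Permute group labels and compare the between-group term."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = partition_squared_distance(d2, labels)[2]
    count = 0
    for _ in range(n_perm):
        permuted = rng.permutation(labels)
        if partition_squared_distance(d2, permuted)[2] >= observed - 1e-12:
            count += 1
    # The observed arrangement is counted as one of the permutations.
    return (count + 1) / (n_perm + 1)


# Synthetic, skewed toy data (hypothetical; not the economics data set).
rng = np.random.default_rng(1)
x = rng.exponential(size=(16, 8))
labels = np.repeat([0, 1], 8)

# Squared Euclidean distances; any other dissimilarity could be substituted.
diff = x[:, None, :] - x[None, :, :]
d2 = (diff ** 2).sum(axis=2)

total, within, between = partition_squared_distance(d2, labels)
p_value = randomization_p_value(d2, labels, n_perm=199)
```

With squared Euclidean distances the three terms coincide with the usual total, within-group and between-group sums of squares summed over variables, which gives a convenient check on the code. The same functions accept any squared dissimilarity matrix, which is the situation of interest when the usual MANOVA assumptions are in doubt.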
