Consider two data matrices on the same sample of n individuals, X (p × n) and Y (q × n). From these matrices, geometrical representations of the sample are obtained as two configurations of n points, in R^p and R^q. It is shown that the RV‐coefficient (Escoufier, 1970, 1973) can be used as a measure of similarity of the two configurations, taking into account the possibly distinct metrics to be used on them to measure the distances between points. The purpose of this paper is to show that most classical methods of linear multivariate statistical analysis can be interpreted as the search for optimal linear transformations or, equivalently, the search for optimal metrics to apply to two data matrices on the same sample. Optimality is defined in terms of the similarity of the corresponding configurations of points, which, in turn, calls for the maximization of the associated RV‐coefficient. The methods studied are principal components, principal components of instrumental variables, multivariate regression, canonical variables, and discriminant analysis; they are differentiated by the possible relationships existing between the two data matrices involved and by the additional constraints under which the maximum of RV is to be obtained. It is also shown that the RV‐coefficient can be used as a measure of goodness of a solution to the problem of discarding variables.
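As an illustration of the similarity measure described above, the following is a minimal sketch of the RV‐coefficient in its commonly stated form, RV = tr(Wx Wy) / sqrt(tr(Wx^2) tr(Wy^2)), where Wx = X'X and Wy = Y'Y are the n × n cross-product matrices of the centered data. It assumes identity metrics on both spaces (the abstract allows for other metrics) and that neither matrix is constant across individuals; the function name `rv_coefficient` is hypothetical and not taken from the paper.

```python
import numpy as np


def rv_coefficient(X, Y):
    """RV coefficient between X (p x n) and Y (q x n); columns are individuals.

    Assumption: identity metrics on R^p and R^q.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.ndim != 2 or Y.ndim != 2 or X.shape[1] != Y.shape[1]:
        raise ValueError("X and Y must be 2-D with the same number of columns (n)")

    # Center each variable (row) across the n individuals.
    Xc = X - X.mean(axis=1, keepdims=True)
    Yc = Y - Y.mean(axis=1, keepdims=True)

    # n x n cross-product matrices, one per configuration of points.
    Wx = Xc.T @ Xc
    Wy = Yc.T @ Yc

    # For symmetric matrices, tr(A @ B) equals the sum of elementwise products.
    numerator = np.sum(Wx * Wy)
    denominator = np.sqrt(np.sum(Wx * Wx) * np.sum(Wy * Wy))
    return numerator / denominator
```

Under these assumptions the value lies between 0 and 1, and it equals 1 when one configuration is a rotated and uniformly rescaled copy of the other, since the two cross-product matrices are then proportional.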
[1] Joseph L. Zinnes, et al. Theory and Methods of Scaling. 1958.
[2] T. W. Anderson, et al. An Introduction to Multivariate Statistical Analysis. 1959.
[3] Calyampudi R. Rao. The use and interpretation of principal component analysis in applied research. 1964.
[4] J. Kruskal. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. 1964.
[5] Yves Escoufier, et al. Echantillonnage dans une population de variables aleatoires reelles. 1970.
[6] John C. Gower, et al. Statistical methods of comparing different multivariate analyses of the same data. 1971.
[7] Y. Escoufier. Le traitement des variables vectorielles. 1973.