On the Independence of k Sets of Normally Distributed Statistical Variables

In such fields of investigation as economics, psychology, and anthropology, where observations on several variables are taken into account simultaneously, it is at least as important to study relationships among the variables as to consider the variables separately. In fact, if there are significant relationships within a system of variables, a considerable part of the information furnished by the observations will be lost unless the relationships are taken into account. In general, very little is known a priori about such a set of variables, and hence our knowledge of them and their various interrelations must be inferred from observations. Questions relating to the problem of making inferences from observations resolve themselves into those of (1) devising suitable functions of the observations for estimating parameters which characterize the hypothetical population of the variables, and (2) determining frequency laws from which the degree of credibility to be placed in the departure of these functions from expectation can be evaluated. The more complicated the hypothesis concerning the interrelations of the variables, the more complex, of course, will be the functions of observations for measuring the relationships and testing the hypothesis.

It frequently happens in multivariate analysis that a number of variables can be rationally classed into several mutually exclusive categories. For example, certain measurable traits of individuals may be classed as physical or mental. In the study of wholesale prices of farm products in a certain region over a certain period of time, the products may be classed as (1) fruits, (2) vegetables, or (3) dairy products, and the deviations of the prices of products within each group from seasonal and secular trends may be taken as the variables. When variables can be grouped in such a manner, the question naturally arises as to whether or not there is any significant relationship between the groups of variables.
That is, on the basis of the available observations, with what degree of credibility can we assert that the groups are mutually independent, so that knowledge relative to one of the groups gives us no significant information about the others? If they are significantly non-independent, how can we measure the amount of dependence? It will become apparent as we proceed that statistical functions¹ and significance tests more general and comprehensive than
¹ See R. Frisch, "Correlation and Scatter in Statistical Variables," Nordisk