The distribution of the variance ratio in random samples of any size drawn from non-normal universes.

With some theoretical results and extensive sampling experiments from populations of known form, E. S. Pearson (1931) has studied the effect of non-normality on the frequency distribution of the variance ratio. In the case of a one-way classification for analysis of variance, he has shown that 'Between-groups' and 'Within-groups' mean squares still continue to provide unbiased estimates of the population variance, but they are no longer independently distributed. However, in view of the fact that the expressions for the first two moments of their ratio (denoted here by w) are, up to certain approximations, independent of the population β's, he has inferred that the normal-theory test will not be seriously invalidated, provided the total number of samples is not too small. But in the more general problem, where two essentially different estimates of variance are compared, he has pointed out that the distribution of their ratio (denoted here by v) will be considerably more sensitive to changes in the population form. The sampling investigation of T. Eden & F. Yates (1933) was not of the same kind as that of E. S. Pearson; it was carried out with the object, as M. G. Kendall (1946) has stated, of confirming the z-test (z = ½ logₑ w) for data under randomization. The experimental material considered by the authors exhibits a decided skewness, as measured by λ₃ (= √β₁); but the other measure of deviation, namely, the kurtosis λ₄ (= β₂ − 3), the effect of which on w is rather more serious, has not been referred to in the course of their work. R. C. Geary (1947) has derived an approximate formula for the probability correction of w, of which a suggestion has been made for tabulation. He has also furnished asymptotic expressions for the first two moments of z (= ½ logₑ v) in samples from any population and discussed some methods for their use in the evaluation of the approximate true probability.
His formulae, in both cases, are based on the large sample assumption and consider the effects of kurtosis only. In the present paper the problem is studied theoretically in some detail by deriving the mathematical forms of the distributions of both the test functions w and v for populations characterized by the a priori values of the universal λ's and expressed by the first four terms (up to λ₃²) of the Edgeworth series. In addition to the normal-theory function, the frequency density in each case furnishes corrective terms in λ₄ and λ₃². The first two moments calculated directly from the derived functions agree up to certain approximations with the results (obtained otherwise by various workers) which are known to be true of any universe. Thus, starting from the Edgeworth series, it seems possible to reach results which in fairly large samples may closely approximate the actual distribution of the variance ratio for any form of population. Formulae for the tail area, derived and tabulated in the text, enable us to examine the true probability of the variance ratio in any size of samples for a priori values of λ₃ and λ₄, provided the populations agree well with the first four terms of the Edgeworth series. The same expressions remain valid asymptotically for any universe with finite cumulants, so
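
The population form assumed above, the first four terms of the Edgeworth series, may be illustrated by a minimal sketch. The explicit expansion is not reproduced in this summary, so the form used below is an assumption: the standard one for a standardized variate, f(x) = φ(x)[1 + (λ₃/6)He₃(x) + (λ₄/24)He₄(x) + (λ₃²/72)He₆(x)], where φ is the normal density and Heₖ are the probabilists' Hermite polynomials. The function name `edgeworth_pdf` and its parameter names are hypothetical.

```python
import math


def edgeworth_pdf(x, lam3, lam4):
    """Four-term Edgeworth density for a standardized variate x.

    lam3 is the skewness (sqrt(beta1)) and lam4 the kurtosis (beta2 - 3).
    Assumed form: terms in lam3, lam4 and lam3**2 only.
    """
    # Standard normal density phi(x).
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

    # Probabilists' Hermite polynomials He3, He4 and He6.
    he3 = x**3 - 3.0 * x
    he4 = x**4 - 6.0 * x**2 + 3.0
    he6 = x**6 - 15.0 * x**4 + 45.0 * x**2 - 15.0

    # Corrective terms; 10 / 6! reduces to 1 / 72 for the lam3**2 term.
    correction = (
        (lam3 / 6.0) * he3
        + (lam4 / 24.0) * he4
        + (lam3**2 / 72.0) * he6
    )
    return phi * (1.0 + correction)
```

With `lam3 = lam4 = 0` the function reduces to the normal density. For large values of the λ's the truncated series can become negative in the tails, which is one reason the results are stated for populations that agree well with the first four terms.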
