Two separate effects of variance heterogeneity on the validity and power of significance tests of location

Abstract: Heterogeneity of variances of treatment groups influences the validity and power of significance tests of location in two distinct ways. First, if sample sizes are unequal, the Type I error rate and power are depressed if a larger variance is associated with a larger sample size, and elevated if a larger variance is associated with a smaller sample size. This well-established effect, which occurs in t and F tests, and to a lesser degree in nonparametric rank tests, results from unequal contributions of pooled estimates of error variance in the computation of test statistics. It is observed in samples from normal distributions, as well as non-normal distributions of various shapes. Second, transformation of scores from skewed distributions with unequal variances to ranks produces differences in the means of the ranks assigned to the respective groups, even if the means of the initial groups are equal, and a consequent inflation of Type I error rates and power. This effect occurs for all sample sizes, equal and unequal. As sample size increases, the discrepancy diminishes for the t test and becomes larger for the Wilcoxon–Mann–Whitney test. The Welch separate-variance t test overcomes the first effect but not the second. Because these separate effects interact, the validity and power of both parametric and nonparametric tests performed on samples of any size from unknown distributions with possibly unequal variances can be distorted in unpredictable ways.
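The first effect can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation: the sample sizes (10 and 40), the standard deviations (1 and 3), the number of replications, and all function names below are assumptions chosen for illustration, and the critical values of Student's t are approximated with a Cornish-Fisher expansion rather than computed exactly. Both groups are drawn from normal distributions with equal means, so every rejection is a Type I error.

```python
import math
import random
from statistics import NormalDist


def t_critical(df, alpha=0.05):
    """Approximate two-sided critical value of Student's t (Cornish-Fisher expansion)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    g1 = (z**3 + z) / 4
    g2 = (5 * z**5 + 16 * z**3 + 3 * z) / 96
    return z + g1 / df + g2 / df**2


def mean_var(x):
    """Return (n, sample mean, unbiased sample variance)."""
    n = len(x)
    m = sum(x) / n
    v = sum((xi - m) ** 2 for xi in x) / (n - 1)
    return n, m, v


def pooled_rejects(s1, s2, alpha=0.05):
    """Student's t test with a pooled estimate of error variance."""
    (n1, m1, v1), (n2, m2, v2) = s1, s2
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return abs(t) > t_critical(df, alpha)


def welch_rejects(s1, s2, alpha=0.05):
    """Welch separate-variance t test with Welch-Satterthwaite degrees of freedom."""
    (n1, m1, v1), (n2, m2, v2) = s1, s2
    a, b = v1 / n1, v2 / n2
    t = (m1 - m2) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
    return abs(t) > t_critical(df, alpha)


def type1_rates(n1, sd1, n2, sd2, reps=4000, seed=1):
    """Estimate Type I error rates (pooled, Welch) when both population means are 0."""
    rng = random.Random(seed)
    pooled = welch = 0
    for _ in range(reps):
        s1 = mean_var([rng.gauss(0, sd1) for _ in range(n1)])
        s2 = mean_var([rng.gauss(0, sd2) for _ in range(n2)])
        pooled += pooled_rejects(s1, s2)
        welch += welch_rejects(s1, s2)
    return pooled / reps, welch / reps


if __name__ == "__main__":
    # Larger variance paired with the larger sample, then with the smaller sample.
    print("large var, large n :", type1_rates(10, 1.0, 40, 3.0))
    print("large var, small n :", type1_rates(10, 3.0, 40, 1.0))
```

Under these assumed settings, the pooled test should reject far less often than the nominal 5% when the larger variance goes with the larger sample, and far more often when it goes with the smaller sample, while the Welch test should stay near the nominal level in both cases. The sketch does not address the second effect, which concerns rank transformation of skewed data.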
