Statistical comparison of univariate tests of homogeneity of variances

This paper compares the empirical type I error and power of different tests that have been proposed to assess the homogeneity of within-group variances prior to anova. The tests of homogeneity of variances (THVs) compared in this study are Bartlett's test, the Scheffé-Box log-anova test, Cochran's C test and Box's M test, in their parametric and permutational forms. The main questions addressed in the paper are: (1) under what conditions is heterogeneity of variances really a problem in anova, and (2) under these conditions, is any of the THVs useful for the detection of heterogeneity? A preliminary simulation study confirmed that anova is very sensitive to heterogeneity of the variances, even when the data are normally distributed. Any pattern of heteroscedasticity results in inflated type I error, the worst results occurring when one variance is larger than the others. A second study was conducted to find out which tests of homogeneity of variances should be used under extreme conditions (small sample sizes, non-normal distributions). The best overall methods are Bartlett's or Box's tests; even with normally distributed data, one should avoid Cochran's test, which is only sensitive to a single high variance, as well as the log-anova test, because it has low power with small to moderate sample sizes. With non-normal data, Bartlett's and Box's tests can be used if the samples are fairly large. Species abundance-like data should be log-transformed and subjected to parametric or permutational Bartlett's or Box's tests. An Appendix presents a comparison of the Welch-corrected t-test with the parametric and permutational forms of the t-test. The test with Welch correction is useful when the data are normal, sample sizes are small, and the variances are heterogeneous. Otherwise, use the parametric t-test for normal data, or the permutational t-test for skewed data. For heteroscedastic data that cannot be normalized, a nonparametric test should be used.
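As an illustration of what a permutational form of Bartlett's test might look like, the following is a minimal sketch in Python. It is not the procedure used in the paper: the function name `permutational_bartlett`, the choice to centre each group on its own mean before permuting, the default of 999 permutations, and the example data are all assumptions made for this sketch. The statistic itself is computed with `scipy.stats.bartlett`.

```python
import numpy as np
from scipy import stats


def permutational_bartlett(groups, n_perm=999, seed=0):
    """Permutation p-value for Bartlett's statistic (illustrative sketch)."""
    rng = np.random.default_rng(seed)

    # Assumption: centre each group on its own mean, so that permutations
    # exchange residuals and differences in means do not affect the test.
    centred = [np.asarray(g, dtype=float) - np.mean(g) for g in groups]
    split_points = np.cumsum([len(g) for g in centred])[:-1]

    # Reference value of the statistic on the unpermuted data.
    observed = stats.bartlett(*centred).statistic

    pooled = np.concatenate(centred)
    count = 1  # the observed value is counted as one of the permutations
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        stat = stats.bartlett(*np.split(shuffled, split_points)).statistic
        if stat >= observed:
            count += 1

    return observed, count / (n_perm + 1)


# Hypothetical example: three groups of 20, one with a much larger variance.
rng = np.random.default_rng(1)
groups = [rng.normal(0, 1, 20), rng.normal(0, 1, 20), rng.normal(0, 10, 20)]
statistic, p_perm = permutational_bartlett(groups)
p_param = stats.bartlett(*groups).pvalue  # parametric p-value, for comparison
print(statistic, p_perm, p_param)
```

With `n_perm` permutations, the smallest attainable permutational p-value is `1 / (n_perm + 1)`, so the number of permutations should be chosen with the intended significance level in mind. For the two-group comparison discussed in the Appendix, SciPy exposes a Welch-corrected t-test as `scipy.stats.ttest_ind(a, b, equal_var=False)`.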