The alleged robustness of Z, t, and F tests against nonnormality and, when sample sizes are equal, of t and F tests against heterogeneity of variance as well, was investigated in a large-scale sampling study under conditions realistic to experimentation and testing in the behavioral sciences. Factors varied were: population shape (L-shaped or bell-shaped), σ₁/σ₂ (1/2, 1, or 2), size N of the smallest sample (2, 4, 8, 16, 32, 64, 128, 256, 512, or 1,024), N₁/N₂ (1/3, 1/2, 1, 2, or 3), α (.05, .01, or .001), and test tailedness (left-tailed, right-tailed, or two-tailed). In about 25% of the situations investigated, the test failed to meet a very lax criterion for robustness at every examined N value less than 100, and in 8% at every value less than 1,000; no test met the criterion in all of the situations studied before N = 512. Robustness was strongly influenced by all of the factors investigated, and interactions among the influencing factors were often strong and complex.
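The kind of sampling experiment described above can be illustrated with a small simulation. The sketch below is not the study's procedure: it assumes a centered exponential distribution as a stand-in for the L-shaped population, uses a two-sample Z test with known standard deviations, and assumes a robustness criterion of an empirical rejection rate between 0.5α and 1.5α (the abstract does not state the criterion). The function names `draw_l_shaped`, `rejection_rate`, and `meets_criterion` are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def draw_l_shaped(rng, n, sigma):
    """Draw n values from a centered exponential (mean 0, SD sigma).

    ASSUMPTION: the exponential stands in for the study's L-shaped population.
    """
    return [sigma * (rng.expovariate(1.0) - 1.0) for _ in range(n)]


def rejection_rate(n1, n2, sigma1, sigma2, alpha, tail, reps=2000, seed=0):
    """Estimate the true Type I error rate of a two-sample Z test.

    Both populations have mean 0, so the null hypothesis is true and every
    rejection is a false rejection.
    """
    if tail not in ("left", "right", "two"):
        raise ValueError("tail must be 'left', 'right', or 'two'")
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference in means, using the known SDs.
    se = (sigma1 ** 2 / n1 + sigma2 ** 2 / n2) ** 0.5
    lower = std_normal.inv_cdf(alpha)          # left-tail critical value
    upper = std_normal.inv_cdf(1.0 - alpha)    # right-tail critical value
    two_sided = std_normal.inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(reps):
        x = draw_l_shaped(rng, n1, sigma1)
        y = draw_l_shaped(rng, n2, sigma2)
        z = (fmean(x) - fmean(y)) / se
        if tail == "left":
            reject = z < lower
        elif tail == "right":
            reject = z > upper
        else:
            reject = abs(z) > two_sided
        rejections += reject
    return rejections / reps


def meets_criterion(rate, alpha):
    """ASSUMED lax criterion: empirical rate within 0.5*alpha to 1.5*alpha."""
    return 0.5 * alpha <= rate <= 1.5 * alpha


if __name__ == "__main__":
    alpha = 0.05
    # Smallest sample size N, with N1/N2 = 1/3 and sigma1/sigma2 = 2.
    for n in (4, 16, 64):
        rate = rejection_rate(n, 3 * n, 2.0, 1.0, alpha, "left")
        print(n, rate, meets_criterion(rate, alpha))
```

Looping such a function over the full grid of shapes, variance ratios, sample sizes, sample-size ratios, α levels, and tails would reproduce the factorial layout of the design, though not the study's actual populations or results.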