Preliminary Goodness-of-Fit Tests for Normality do not Validate the One-Sample Student t

Inference for a population mean, μ, is among the most basic topics in introductory statistical methods texts. The primary tool for confidence intervals and tests is the Student t sampling distribution. Although its derivation requires independent, identically distributed normal random variables with constant variance, σ², most authors reassure readers that the procedure is somewhat robust to departures from normality and constant variance. Some point out that anyone concerned about the assumptions may test them statistically before relying on the Student t, and most software packages offer optional tests of both (a) the Gaussian assumption and (b) homogeneity of variance. Many textbooks instead advise only informal graphical assessments: certain scatterplots for independence, others for constant variance, and normal quantile–quantile plots for the adequacy of the Gaussian model. We concur with this recommendation. As evidence against formal tests of (a), such as the Shapiro–Wilk test, we offer a simulation study of the tails of the conditional sampling distributions of the Studentized mean that result from such a pretest. We analyze the results of systematically screening all samples from normal, uniform, exponential, and Cauchy populations. The pretest does not correct the erroneous significance levels, and for the exponential population it makes matters worse. We conclude that, in practice, graphical diagnostics are better than a formal pretest. Furthermore, rank or permutation methods are recommended for exact validity in the symmetric case.
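The kind of simulation described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' code: the sample size (n = 10), the number of replications, the 5% levels for both the pretest and the t test, and the names `conditional_t_rejection` and `POPULATIONS` are all hypothetical choices. For the Cauchy population, which has no mean, the sketch uses the center of symmetry (0) as the null value. Each sample is drawn from one population and Studentized at its true center. The two-sided rejection rate is then recorded both for all samples and for only those that pass a Shapiro–Wilk pretest.

```python
import numpy as np
from scipy import stats

# Hypothetical population table: name -> (sampler, true center used as the null value).
# For the Cauchy, the "center" is the point of symmetry, since the mean does not exist.
POPULATIONS = {
    "normal": (lambda rng, n: rng.normal(size=n), 0.0),
    "uniform": (lambda rng, n: rng.uniform(size=n), 0.5),
    "exponential": (lambda rng, n: rng.exponential(size=n), 1.0),
    "cauchy": (lambda rng, n: rng.standard_cauchy(size=n), 0.0),
}


def conditional_t_rejection(sampler, center, n=10, reps=2000,
                            alpha=0.05, pre_alpha=0.05, seed=0):
    """Estimate two-sided t-test rejection rates under a true null,
    unconditionally and conditionally on passing a Shapiro-Wilk pretest.

    All defaults (n, reps, alpha, pre_alpha) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    passed = rejected_all = rejected_passed = 0
    for _ in range(reps):
        x = sampler(rng, n)
        # Studentized mean, centered at the true value so the null holds.
        t_stat = (x.mean() - center) / (x.std(ddof=1) / np.sqrt(n))
        reject = bool(abs(t_stat) > t_crit)
        rejected_all += reject
        # Pretest: keep the sample only if normality is NOT rejected.
        _, p_value = stats.shapiro(x)
        if p_value >= pre_alpha:
            passed += 1
            rejected_passed += reject
    return {
        "pass_rate": passed / reps,
        "unconditional_rate": rejected_all / reps,
        "conditional_rate": rejected_passed / passed if passed else float("nan"),
    }


if __name__ == "__main__":
    for name, (sampler, center) in POPULATIONS.items():
        print(name, conditional_t_rejection(sampler, center))
```

Comparing `conditional_rate` with the nominal `alpha` for each population shows whether screening brings the t test closer to its stated level. The sketch pools both tails, so a tail-by-tail comparison would need the upper and lower rejections counted separately.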
