The first step in making inferences under the frequentist system of statistical logic is to propose a null hypothesis. An experiment is then performed, or a set of observations made, and the resulting data are subjected to statistical analysis to determine whether the null hypothesis should be rejected. If it is rejected, some alternative hypothesis must be entertained. In biomedical work the alternative hypothesis should usually be non-specific, and it follows that the statistical test of the null hypothesis should be interpreted in a two-sided fashion.

The decision to reject or accept a statistical null hypothesis, whether on the basis of a P value or of confidence intervals, is probabilistic in nature and always attended by the risk of error. It is argued that, in biomedical research, it is the risk of making false-positive statistical inferences (Type I error) that should be most closely controlled. That risk cannot be considered in isolation from the model of inference under which the null hypothesis is tested.

The basis for the classical t, F and χ² tests is the population model, in which the inference is referred to a defined population that has been randomly sampled and that conforms to a specified frequency distribution. Under this model, serious errors in statistical inference can occur if the actual distributions of the populations do not conform to those specified by theory. More importantly, the population model is inappropriate to most biomedical research, in which treatment groups are created by randomization rather than by random sampling.

(ABSTRACT TRUNCATED AT 250 WORDS)
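The alternative to the population model alluded to above is the randomization model, under which the null hypothesis is that the treatment labels are exchangeable: given the randomization actually performed, every re-allocation of the observed values to the treatment groups was equally likely. A minimal sketch of a two-sided randomization (permutation) test on the difference in group means is given below; the data and the Monte Carlo resampling scheme are illustrative assumptions, not part of the original text.

```python
import random


def randomization_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo randomization test on the difference in means.

    Under the randomization model the null hypothesis is that treatment
    labels are exchangeable, so the null distribution of the test statistic
    is generated by re-allocating the pooled observations to the two groups.
    """
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # one equally likely re-allocation of labels
        diff = abs(sum(pooled[:n_a]) / n_a
                   - sum(pooled[n_a:]) / (len(pooled) - n_a))
        if diff >= observed:
            hits += 1

    # Count the observed allocation itself among the possible ones, so the
    # P value can never be exactly zero.
    return (hits + 1) / (n_resamples + 1)
```

With small groups the full set of re-allocations can be enumerated exactly instead of sampled; the Monte Carlo version above merely approximates that exact P value. Note that the inference refers only to the experiment actually randomized, not to a sampled population.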