Hypotheses, errors, and statistical assumptions

Seaman and Jaeger (1990) contend that presumptuous use of parametric statistical methods to test hypotheses can lead ecologists astray; I wholeheartedly concur. As they point out, the usual argument against nonparametric alternatives is that they lack power: there is a high probability of failing to reject a false null hypothesis (type II error). To increase the power of a particular test (lower the probability of a type II error), however, one must generally increase the probability of a type I error, the probability of rejecting a true null hypothesis (Conover, 1980). Seaman and Jaeger (1990) suggest that the view that nonparametric tests are less powerful than parametric alternatives is often untrue and rests in many instances on assumptions about the shape of an underlying distribution that are either incorrect or untestable. In addition, even in situations in which nonparametric tests are somewhat less powerful than parametric ones, there may still be a good reason for ecologists to favor the former: the relative costs of the two kinds of errors (Connor and Simberloff, 1986; Toft and Shea, 1983).

It is important to distinguish between statistical hypotheses and scientific hypotheses (Boen, 1989; Connor and Simberloff, 1986). Scientific hypotheses are about phenomena in nature; statistical hypotheses are about properties of populations based on samples. Thus, a statistical hypothesis can be a quantified application of a scientific hypothesis to a specific set of data, and rejection of one or more statistical hypotheses would constitute one piece of evidence to be weighed in deciding whether to reject a scientific hypothesis.

A related distinction is between global and local hypotheses (Dolby, 1982). A global hypothesis applies to all of nature, while a local one applies to particular systems. A specific statistical hypothesis is local unless global populations have been sampled.
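The tradeoff described above can be made concrete with a small Monte Carlo sketch (not from the original article; all parameters here are hypothetical). For a two-sided z-test with known variance, the rejection rate under a true null estimates the type I error rate, while the rejection rate under a false null estimates power; tightening the significance level reduces the former at the cost of the latter.

```python
import math
import random

def z_test_rejects(sample, mu0, sigma, alpha):
    # Two-sided z-test with known sigma (a textbook simplification,
    # chosen only to keep the illustration self-contained).
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    # Standard normal critical values for the two alpha levels used below.
    crit = {0.05: 1.96, 0.01: 2.576}[alpha]
    return abs(z) > crit

def rejection_rate(true_mu, mu0=0.0, sigma=1.0, n=25, alpha=0.05,
                   sims=4000, seed=1):
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        rejections += z_test_rejects(sample, mu0, sigma, alpha)
    return rejections / sims

# When true_mu equals mu0 the null is true, so the rejection rate estimates
# the type I error rate; when true_mu differs, it estimates power
# (1 minus the type II error rate).
for alpha in (0.05, 0.01):
    type1 = rejection_rate(true_mu=0.0, alpha=alpha)
    power = rejection_rate(true_mu=0.5, alpha=alpha)
    print(f"alpha={alpha}: type I rate ~ {type1:.3f}, power ~ {power:.3f}")
```

With these hypothetical settings, moving from alpha = 0.05 to alpha = 0.01 cuts the type I error rate but also visibly reduces power, which is exactly the tension the text describes.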
Error classification was developed specifically for testing statistical hypotheses, but the terms are used metaphorically, and appropriately, for global, scientific hypotheses. In either case, if the null hypothesis is that some process or phenomenon has no effect, then a type I error consists in concluding that the process does have an effect when, in fact, it does not. A type II error for the same case would be concluding that the process has no effect when it actually does.

Finally, a distinction must be made between classical statistical hypothesis-testing and decision theory (Kyburg, 1974). Testing statistical hypotheses is often an aid to inferring whether a scientific hypothesis is true or false, as noted above; it does not explicitly take account of the costs of errors. By contrast, in some settings a particular statistical hypothesis may be tested over and over again, and the test results acted on each time. For example, one may test samples from batches of computer chips, then save or discard entire batches based on the test results. This problem motivated the development of decision theory, which is a theory of acting rationally (with respect to expected losses and gains) in the face of uncertainty, rather than a theory of inference about nature. If statistics is to be used for this purpose, the costs attending various kinds of errors must be explicitly assessed. Below I argue that the costs of errors must also be considered when statistical hypotheses are tested as part of testing scientific hypotheses.

Perusing the medical literature nowadays, one can easily conclude that type II error is the greater scourge and entails the larger costs (e.g., Freiman et al., 1978; Marks et al., 1988). Bourne (1987) went so far as to title a paper, "'No Statistically Significant Difference': So What?" His particular hypothetical example was a test of whether a particular treatment of a disease is efficacious, and he betrayed his underlying reasoning at the outset (p. 40):
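The decision-theoretic weighing of error costs described above can be sketched as a one-line expected-loss calculation. All numbers here are hypothetical, chosen only to illustrate the point: if a missed effective treatment (type II error) is judged far costlier than a false claim of efficacy (type I error), a more lenient significance level can yield a lower expected cost.

```python
def expected_cost(alpha, beta, p_effect, cost_type1, cost_type2):
    # Expected loss per decision: a type I error can occur only when the
    # null is true (probability 1 - p_effect); a type II error only when
    # the null is false (probability p_effect).
    return (1 - p_effect) * alpha * cost_type1 + p_effect * beta * cost_type2

# Hypothetical scenario: 30% of tested treatments truly work, and a type II
# error is judged ten times as costly as a type I error. The beta values
# are likewise illustrative stand-ins for each test's type II error rate.
strict = expected_cost(alpha=0.01, beta=0.50, p_effect=0.3,
                       cost_type1=1.0, cost_type2=10.0)
lenient = expected_cost(alpha=0.10, beta=0.20, p_effect=0.3,
                        cost_type1=1.0, cost_type2=10.0)
print(f"strict test:  expected cost {strict:.3f}")
print(f"lenient test: expected cost {lenient:.3f}")
```

Under these assumed costs the lenient test is the rational choice, which mirrors the argument that error costs, not significance levels alone, should drive the decision.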

References

[1] S. L. Pimm et al., "On the Risk of Extinction," The American Naturalist, 1988.
[2] C. Poole et al., "Beyond the confidence interval," American Journal of Public Health, 1987.
[3] W. M. Bourne, "'No statistically significant difference': So what?," Archives of Ophthalmology, 1987.
[4] C. Toft et al., "Detecting Community-Wide Patterns: Estimating Power Strengthens Statistical Inference," The American Naturalist, 1983.
[5] S. Goodman et al., "Evidence and scientific research," American Journal of Public Health, 1988.
[6] G. R. Dolby, "The Role of Statistics in the Methodology of the Life Sciences," 1982.
[7] T. C. Chalmers et al., "The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 'negative' trials," The New England Journal of Medicine, 1978.
[8] M. E. Gilpin et al., "Are Species Co-occurrences on Islands Non-random, and Are Null Hypotheses Useful in Community Ecology?," 1984.
[9] H. E. Kyburg, "The Logical Foundations of Statistical Inference," 1974.
[10] J. C. Bailar et al., "Interactions between statisticians and biomedical journal editors," Statistics in Medicine, 1988.
[11] H. Wulff et al., "What do doctors know about statistics?," Statistics in Medicine, 1987.
[12] W. Browner et al., "Are all significant P values created equal? The analogy between diagnostic tests and clinical research," JAMA, 1987.
[13] J. R. Boen, "Understanding p-value misuse," Statistics in Medicine, 1989.
[14] J. W. Seaman et al., "Statisticae dogmaticae: a critical essay on statistical practice in ecology," 1990.
[15] R. A. Groeneveld et al., "Practical Nonparametric Statistics (2nd ed.)," 1981.
[16] L. E. Moses et al., "Parametric and nonparametric analyses of the same data," Social Science & Medicine, 1989.
[17] W. Cleveland et al., "Locally Weighted Regression: An Approach to Regression Analysis by Local Fitting," 1988.
[18] W. J. Conover et al., "Practical Nonparametric Statistics," 1972.
[19] D. A. Wolfe et al., "Nonparametric Statistical Methods," 1973.
[20] K. McPherson et al., "Doctors' ignorance of statistics," British Medical Journal, 1987.
[21] J. Gastwirth, "Non-parametric Statistical Methods," 1990.
[22] J. F. Quinn et al., "On Hypothesis Testing in Ecology and Evolution," The American Naturalist, 1983.
[23] K. J. Rothman et al., "A show of confidence," The New England Journal of Medicine, 1978.