Some good reasons to ban the use of NOEC, LOEC and related concepts in ecotoxicology

I have no idea who was the first to invent the concepts of No Observed Effect Concentration (NOEC) and Lowest Observed Effect Concentration (LOEC) in ecotoxicology. But I do believe that the introduction of these terms was the most serious misfortune to happen to ecotoxicology. Probably, the whole idea originates from the early (and still very useful) toxicological experiments in which the effects of various toxicants were compared, rather than different doses of the same chemical. Obviously, in these situations the best way to characterize various chemical substances in terms of the magnitude of their toxicity is by testing the hypothesis of no difference between chemicals, frequently supplemented by the more general one of no significant effect at all. Several methods for testing this kind of hypothesis are described in every statistical textbook. The most widely used in toxicological studies are probably the Student t-test and the analysis of variance for multiple comparisons. In general, of course, this approach is fully justified, except in the case of serious violations of its basic assumptions, i.e. the type of the frequency distribution of the variable and the homogeneity of variances across factors. As shown by some recent Monte Carlo simulations, even these violations do not significantly affect the results of an analysis of variance unless the distribution deviates strongly from normality and the sample is small (CSS: Statistica 1991). When a significant overall effect is detected, some a posteriori tests are usually employed for separating the means of particular treatments. The eventual outcome of this hypothesis-testing approach is a set of significant difference intervals. These are calculated with a preset α-level (type I error), which in the biological and environmental sciences is customarily set at 0.05.
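The comparison just described can be sketched in a few lines. The effect measurements below are hypothetical, invented purely for illustration, and for brevity the t statistic is compared against the tabulated critical value rather than converted to an exact p-value:

```python
# Sketch of the classical comparison the text describes: testing the null
# hypothesis of no difference between the effects of two chemicals with
# Student's t-test. All data here are hypothetical.
import math

chemical_a = [12.1, 14.3, 13.8, 12.9, 13.5]   # hypothetical effect values
chemical_b = [16.2, 15.8, 17.1, 16.5, 15.4]   # (e.g. % growth inhibition)

n_a, n_b = len(chemical_a), len(chemical_b)
mean_a = sum(chemical_a) / n_a
mean_b = sum(chemical_b) / n_b

# Pooled variance (the classical Student t-test assumes equal variances).
ss_a = sum((x - mean_a) ** 2 for x in chemical_a)
ss_b = sum((x - mean_b) ** 2 for x in chemical_b)
sp2 = (ss_a + ss_b) / (n_a + n_b - 2)

t = (mean_b - mean_a) / math.sqrt(sp2 * (1 / n_a + 1 / n_b))

# Two-tailed critical value for alpha = 0.05, df = n_a + n_b - 2 = 8.
t_crit = 2.306
print(f"t = {t:.2f}")
if abs(t) > t_crit:
    print("reject H0: the chemicals' effects differ at alpha = 0.05")
else:
    print("fail to reject H0: no evidence of a difference at alpha = 0.05")
```

The decision rule here is exactly the one the passage above describes: a difference is declared only when the test statistic exceeds the threshold set by the preset α-level of 0.05.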
The rather trivial passage above is essential for understanding why the hypothesis-testing approach is fully justified when comparing the effects of various chemicals but, on the other hand, using it to establish NOEC or LOEC levels is simply a misapplication. Having once calculated the significant difference intervals (and assuming that they do not overlap), we can now state that if the effect of chemical A differs from that of chemical B in our experiment, there is at most a 0.05 probability that they do so by pure chance. Or the opposite: if the calculated intervals for chemicals A and B do overlap, we can say that the probability that the effects of chemicals A and B differ in our experiment by pure chance alone is greater than 0.05. Let A be the control and B the tested chemical. Given results like those in the latter example, can we say that there is no effect of chemical B at all? Certainly not! All we can say is that a difference like the one obtained in our test can relatively easily arise by chance, due to sampling error. So, we have found no evidence strong enough to reject the null hypothesis of no difference. One has to remember, however, that this decision is based on the rather arbitrary α-level of 0.05, and that there is also a (only exceptionally reported) β-level (type II) error. In many studies, the type I error is probably of more concern than the type II error. If we study, e.g., political attitudes among Catholics and Protestants, one would obviously be very cautious about too easily drawing the conclusion that the two populations really differ in this respect. But with testing potentially harmful chemicals it is a different story. We should be extremely cautious about overlooking an effect if there is one! And this is precisely the type II error. I am convinced that, unfortunately, in many ecotoxicological studies the type II error was not estimated at all. It is quite frequent for the β-level to exceed 10%, or even 20%.
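How easily a real effect is overlooked can be made concrete with a small Monte Carlo sketch. The sample size and effect size below are hypothetical choices for illustration: with only five replicates per group, a t-test at α = 0.05 misses a genuine one-standard-deviation effect most of the time.

```python
# Monte Carlo estimate of the type II (beta) error: how often a two-sample
# t-test at alpha = 0.05, with n = 5 replicates per group, fails to detect
# a true effect of one standard deviation. Parameters are illustrative.
import math
import random

random.seed(1)

n = 5            # replicates per group
effect = 1.0     # true difference between groups, in SD units
t_crit = 2.306   # two-tailed critical t, alpha = 0.05, df = 2n - 2 = 8
trials = 5000

misses = 0
for _ in range(trials):
    control = [random.gauss(0.0, 1.0) for _ in range(n)]
    treated = [random.gauss(effect, 1.0) for _ in range(n)]
    mc, mt = sum(control) / n, sum(treated) / n
    # pooled variance for the classical Student t statistic
    ss = (sum((x - mc) ** 2 for x in control)
          + sum((x - mt) ** 2 for x in treated))
    sp2 = ss / (2 * n - 2)
    t = (mt - mc) / math.sqrt(sp2 * (2 / n))
    if abs(t) < t_crit:
        misses += 1          # a real effect is present, but not detected

beta = misses / trials
print(f"estimated type II error (beta): {beta:.2f}")   # far above 0.05
```

Under these (admittedly arbitrary) conditions the estimated β is well above the 10-20% figures mentioned above, which is exactly why declaring such a non-significant treatment a "no observed effect concentration" is so dangerous.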