Comments: A Schema for Proper Utilization of Multiple Comparisons in Research and a Case Study¹

The recently published study by Rothkopf (1966) illustrates a common, albeit undesirable, phenomenon in behavioral research: the design and analysis are not adequately congruent. Rothkopf's study reflects unusual care and insight into matters of design and internal validity, with one major deficiency: he does not indicate that the subjects were randomly assigned to treatments, although there is good reason to expect that this was the case. The authors are in complete sympathy with the position that design is of primary importance and statistical analysis secondary, that statistics should be the researcher's slave, not his master. Such a position, however, is neither equivalent to, nor an excuse for, selecting inappropriate statistical techniques and models that yield only crude estimates of the probability of Type I errors, these being, of course, the basis for inference.

Rothkopf's study examines the effects of experimental test-like questions (EQs) on a general achievement test (GT) for various reading selections, varying the temporal position of the EQs [before (B) and after (A)], with and without given answers. He also compared the effects of interspersed EQs with those of EQs given in a block prior to reading. In addition, he employed two control groups (no EQs), one of which was given no specialized instructions (C) while the other was directed to read carefully and slowly (called the Direction Reference Group, DRG, by Rothkopf). Rothkopf's (1966) seven-level one-way ANOVA design is illustrated below, showing the combinations of the above-mentioned variables which he included.

EQs    No EQs (controls)