A COMMENT ON EINHORN'S “ALCHEMY IN THE BEHAVIORAL SCIENCES”

Einhorn has presented some interesting results from applying data-analysis programs to random data, but he comes to some wrong conclusions. The dichotomy he raises between mindless ransacking and precise testing of a theory is a false one. In the real world no one does either. It would be absurd to take any set of data of any richness and test a single model with it. But once one fits more than one model to the data (e.g., runs more than one regression), the whole logical structure of significance tests topples. Our problem is not insufficient theory, but too many competing theories.

We are choosing among a number of models, each probably misspecified. What, then, is an appropriate strategy? Clearly, significance tests are inappropriate. Indeed, no such tests are produced by any version of the AID program developed by the Survey Research Center, and conducting significance tests on the results of an AID analysis is totally illegitimate and irrelevant. Such tests are equally inappropriate when one runs a whole series of regressions on a set of data. If we are then selecting among competing hypotheses, usually a substantial number of them, theory operates at a different level: in the range of possibilities considered and in the extent to which assumptions (additivity, linearity) are imposed for statistical convenience.

We argue that what is important in the inevitable ransacking is some prestated strategy, so that the process is reproducible. This is what AID does: it makes the strategy visible. What is also required, if the results are to be carried very far, is some assessment of the power of the finally selected model in explaining the world from which the sample was drawn.

Einhorn does correctly stress the need for replicating the results. This requires refitting that model to a fresh (independent) sample. This is simple to say, but complex to put into practice, for the following reasons: In any complex (clustered, multistage) sample, dividing it into