1. Introduction and Summary

It is a commonplace of the statistical design of experiments that the hypotheses to be tested should be formulated before examining the data that are to be used to test them. Even in experimental situations, this is sometimes not possible, and in the last decade or so some progress has been made toward the development of more flexible testing procedures which allow the data to be dredged for hypotheses in certain ways. In survey analysis, which is commonly exploratory, it is rare for precise hypotheses to be formulable independently of the data. It follows that normally no precise probabilistic interpretations can validly be given to relationships found among the survey variables. In practice, this has not prevented survey practitioners from reporting probability levels as if they were precisely meaningful. Most investigators are so accustomed to making probability statements that a survey report looks naked without them, but we fear that many survey reports are wearing the Emperor's clothes. This paper offers a classification of data-dredging procedures and some comments on their use.