Synthetic retrospective studies and related topics.

SUMMARY Prospective and retrospective approaches for estimating the influence of several variables on the occurrence of disease are discussed. The assumptions under which these approaches would tend to yield the same estimates as an ideal but unattainable experimental design are stated. It is then brought out that in a large prospective study in which comparatively few cases of disease have occurred, the computational burden can be great enough to preclude a comprehensive and imaginative analysis of the data. The prospective study can be converted into a synthetic retrospective study by selecting a random sample of the cases and a random sample of the noncases, the sampling proportion being small for noncases but essentially unity for cases. It is demonstrated that such sampling will tend to leave the dependence of the log odds on the variables unaffected except for an additive constant. A noniterative discriminant function method of analysis is noted but is indicated to be not generally appropriate. Conversely, it is suggested that normal data can be analyzed by a log-odds approach, which yields alternative tests to those ordinarily used for comparing two or several means or mean vectors, or two or several variances or variance-covariance matrices.
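
The additive-constant property stated in the summary can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the summary: D indicates disease, S indicates selection into the synthetic retrospective sample, x is the vector of variables, and \pi_1 and \pi_0 are the sampling proportions for cases and noncases, each assumed not to depend on x. The linear form \alpha + \beta^{\top} x for the log odds is likewise assumed only for the last step.

```latex
% Selection depends on disease status only, not on x (assumed):
%   P(S=1 \mid D=1, x) = \pi_1, \qquad P(S=1 \mid D=0, x) = \pi_0 .
\begin{align*}
\frac{P(D=1 \mid x, S=1)}{P(D=0 \mid x, S=1)}
  &= \frac{P(S=1 \mid D=1, x)\, P(D=1 \mid x)}
          {P(S=1 \mid D=0, x)\, P(D=0 \mid x)}
   = \frac{\pi_1}{\pi_0} \cdot \frac{P(D=1 \mid x)}{P(D=0 \mid x)}, \\[4pt]
\log \frac{P(D=1 \mid x, S=1)}{P(D=0 \mid x, S=1)}
  &= \log \frac{\pi_1}{\pi_0} + \log \frac{P(D=1 \mid x)}{P(D=0 \mid x)} .
\end{align*}
% If the log odds in the full prospective study are \alpha + \beta^{\top} x, then
% in the sampled data they are \alpha^{*} + \beta^{\top} x with
%   \alpha^{*} = \alpha + \log(\pi_1 / \pi_0),
% so the coefficients \beta are unchanged and only the constant shifts.
```

With \pi_1 essentially unity and \pi_0 small, the constant \log(\pi_1/\pi_0) is positive. If \pi_0 is known, subtracting this constant from the fitted intercept recovers \alpha.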