A two-stage design for the study of the relationship between a rare exposure and a rare disease.

Studies of the relationship between a rare disease and a rare exposure to a risk factor require a very large sample size to obtain reasonable estimates of risk. The cost of such studies is often prohibitive. This paper presents a less costly, two-stage approach. Disease and exposure status are ascertained on a large sample in the first stage, but covariate data are collected on only a subsample in the second stage. This subsample is chosen by sampling separately from each of the four groups defined by disease and exposure status (diseased and exposed, diseased and unexposed, nondiseased and exposed, nondiseased and unexposed). The efficiency of this design is achieved by sampling a large proportion (or all) of the subjects from the small groups and a smaller proportion of those from the large groups. An example analysis of data from this design, based on weighted least squares techniques, is given. Application of this new design to studies that do not involve a rare disease and a rare exposure is discussed.
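The second-stage selection described above might be sketched as follows. This is a minimal illustration and not the paper's procedure. The cell counts, the fixed per-cell quota of 100, and the function name `second_stage_sample` are all hypothetical. The weights shown are simple inverse sampling fractions, which is an assumption; the abstract says only that the analysis is based on weighted least squares.

```python
import random


def second_stage_sample(subjects, per_cell, seed=0):
    """Sample up to `per_cell` subjects from each disease-by-exposure cell.

    subjects: list of (subject_id, diseased, exposed) from the first stage.
    per_cell: positive integer quota (hypothetical design choice).
    Returns {(diseased, exposed): (sampled_ids, weight)}, where weight is
    the inverse sampling fraction N_cell / n_cell (an assumed weighting).
    """
    rng = random.Random(seed)

    # Group first-stage subjects into the four cells.
    cells = {}
    for subject_id, diseased, exposed in subjects:
        cells.setdefault((diseased, exposed), []).append(subject_id)

    plan = {}
    for cell, ids in cells.items():
        # Small cells are taken in full; large cells are subsampled.
        n = min(per_cell, len(ids))
        chosen = rng.sample(ids, n)
        plan[cell] = (chosen, len(ids) / n)
    return plan


# Hypothetical first-stage counts for a rare disease and a rare exposure.
counts = {
    (True, True): 20,       # diseased, exposed
    (True, False): 180,     # diseased, unexposed
    (False, True): 300,     # nondiseased, exposed
    (False, False): 9500,   # nondiseased, unexposed
}
first_stage = []
for (diseased, exposed), n in counts.items():
    for _ in range(n):
        first_stage.append((len(first_stage), diseased, exposed))

plan = second_stage_sample(first_stage, per_cell=100)
for cell, (chosen, weight) in sorted(plan.items(), reverse=True):
    print(cell, len(chosen), weight)
```

With these made-up counts, covariates would be collected on 320 of the 10,000 first-stage subjects: all 20 diseased and exposed subjects (weight 1.0) and 100 from each of the other cells (weights 1.8, 3.0, and 95.0). The weighted cell sizes add back up to the first-stage total.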