The design and analysis of case-control studies with biased sampling.

A design is proposed for case-control studies in which selection of subjects for full variable ascertainment is based jointly on disease status and on easily obtained "screening" variables that may be related to the disease. Recruitment of subjects follows an independent Bernoulli sampling scheme, with recruitment probabilities set by the investigator in advance. In particular, the sampling can be set up to achieve frequency matching on average, provided that prior estimates are available of the disease rates or odds ratios associated with screening variables such as age and sex. Alternatively, for example when studying a rare exposure, one can enrich the sample with certain categories of subjects. Following such a design, there are two valid approaches to logistic regression analysis, both of which allow for efficient estimation of effects associated with the screening variables that were allowed to bias the recruitment. The statistical properties of the estimators are compared for large samples, using asymptotics, and for small samples, using simulations.
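The recruitment scheme described above can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the stratum names, the counts, the choice to recruit every case with probability 1, and the helper names `control_probabilities`, `recruit`, and `log_offset`. The last helper computes the log ratio of case to control recruitment probabilities, a commonly used offset when sampling fractions are known; whether it corresponds to either of the two analyses the abstract mentions is not stated there.

```python
import math
import random


def control_probabilities(expected_cases, available_controls, ratio=1.0):
    """Per-stratum Bernoulli recruitment probability for controls.

    Chosen so that the expected number of recruited controls in each
    stratum equals ratio * expected cases (frequency matching on average).
    Probabilities are capped at 1.
    """
    probs = {}
    for stratum, n_cases in expected_cases.items():
        target = ratio * n_cases
        probs[stratum] = min(1.0, target / available_controls[stratum])
    return probs


def recruit(subjects, p_case, p_control, rng):
    """Recruit each subject independently with its preset probability."""
    recruited = []
    for stratum, is_case in subjects:
        p = p_case[stratum] if is_case else p_control[stratum]
        if rng.random() < p:
            recruited.append((stratum, is_case))
    return recruited


def log_offset(p_case, p_control):
    """Log ratio of case to control recruitment probabilities per stratum."""
    return {s: math.log(p_case[s] / p_control[s]) for s in p_case}


# Hypothetical prior estimates for two age strata (illustrative numbers only).
expected_cases = {"young": 20, "old": 80}
available_controls = {"young": 1000, "old": 400}

p_case = {s: 1.0 for s in expected_cases}  # assumption: recruit every case
p_control = control_probabilities(expected_cases, available_controls)
offsets = log_offset(p_case, p_control)

# Build the hypothetical source population and apply Bernoulli recruitment.
subjects = []
for s in expected_cases:
    subjects += [(s, True)] * expected_cases[s]
    subjects += [(s, False)] * available_controls[s]

rng = random.Random(0)  # fixed seed so the illustration is reproducible
sample = recruit(subjects, p_case, p_control, rng)
n_cases = sum(1 for _, is_case in sample if is_case)
n_controls = len(sample) - n_cases
```

Because recruitment is Bernoulli, the number of controls per stratum matches the case distribution only in expectation; any single realized sample will vary around those targets.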
