Flexible maximum likelihood methods for assessing joint effects in case-control studies with complex sampling.

Case-control studies can often be made more efficient by using frequency matching, randomized recruitment, stratified sampling, or two-stage sampling. These designs share two common features: (1) some "first-stage" variables are ascertained for all study subjects, while complete variable ascertainment is carried out for only a selected subsample, and (2) the subsampling of subjects for "second-stage" variable ascertainment depends jointly on their disease status and their observed first-stage variables. Because first-stage variables alter the subsampling fractions, standard analyses require a multiplicative specification of any joint effects of a second- and a first-stage variable. We show that by making use of missing data methods, maximum likelihood estimates can be obtained for risk parameters of interest, even those characterizing interactions between first- and second-stage variables. Joint effects can thus be modelled flexibly, with allowance for both additive and multiplicative models. Preliminary data from a case-control study of lung cancer as related to age, sex, and smoking provide an example, leading to the suggestion that the combined effect of age and smoking is multiplicative.
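The reason subsampling forces a multiplicative specification in standard analyses can be illustrated with a small numerical sketch. It is not the authors' method, and every risk, stratum name, and sampling fraction below is a hypothetical value chosen for illustration. It assumes only Bayes' rule: if cases and controls in a first-stage stratum are retained with different probabilities, the odds of disease among retained subjects equal the population odds multiplied by the ratio of those two sampling fractions.

```python
def odds(p):
    """Convert a probability of disease into odds."""
    return p / (1.0 - p)


def sampled_odds(p, f_case, f_control):
    """Odds of disease among subjects retained for second-stage ascertainment.

    p         : population risk of disease for a covariate pattern
    f_case    : probability that a case with this pattern is retained
    f_control : probability that a control with this pattern is retained
    """
    return (p * f_case) / ((1.0 - p) * f_control)


# Hypothetical population risks by (first-stage stratum, second-stage exposure).
risk = {
    ("young", "unexposed"): 0.01,
    ("young", "exposed"): 0.03,
    ("old", "unexposed"): 0.02,
    ("old", "exposed"): 0.06,
}

# Hypothetical sampling fractions (cases, controls); they depend on disease
# status and on the first-stage stratum only, as in the designs described above.
fractions = {"young": (1.0, 0.10), "old": (1.0, 0.50)}

# Factor by which subsampling distorts the odds for each covariate pattern.
distortion = {}
for (stratum, exposure), p in risk.items():
    f_case, f_control = fractions[stratum]
    distortion[(stratum, exposure)] = sampled_odds(p, f_case, f_control) / odds(p)

for key, factor in distortion.items():
    print(key, round(factor, 6))
```

In this sketch the distortion factor is the same for exposed and unexposed subjects within a stratum, so it acts as a stratum-specific multiplier on the odds. A model that is multiplicative in the first-stage variable can absorb such a factor, whereas an additive joint-effect model cannot, which is the restriction the missing-data likelihood approach is meant to lift.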
