Logistic analysis in case-control studies under validation sampling

SUMMARY. This paper proposes an analysis of case-control data under a double-sampling scheme, in which covariates are missing or measured with error at the first stage of sampling and are validated in a subsample at the second stage. The method combines risk information from both samples. It is derived under the assumptions that (i) the prospective disease incidence model is of logistic form, (ii) the proxy or partial information takes on finitely many values, and (iii) the error is nondifferential. The method of Prentice & Pyke (1979) is extended to a two-sample design to derive an estimating equation for the odds-ratio parameters. It provides an alternative to the estimator given by Breslow & Cain (1988). Consistency and asymptotic normality of the estimates are derived and a variance formula is presented. Parameters can be estimated with standard packages for quantal-response data. The method can easily be extended to the analysis of stratified designs with large strata.
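The single-sample result that the method builds on can be illustrated with a short calculation. The sketch below is not the two-sample estimator proposed in the paper; it only checks the well-known property, associated with Prentice & Pyke (1979), that under a prospective logistic model case-control sampling shifts the intercept by the log ratio of the sampling fractions and leaves the log odds ratio unchanged. The parameter values, the sampling fractions, and the function names are hypothetical and chosen for illustration only.

```python
import math


def logistic(eta):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def case_control_logit(alpha, beta, x, f_case, f_control):
    """Log odds of being a case among *sampled* subjects with covariate x.

    alpha, beta : prospective logistic parameters (hypothetical values)
    f_case      : probability that a diseased subject is sampled
    f_control   : probability that a healthy subject is sampled
    """
    p = logistic(alpha + beta * x)            # prospective disease risk
    sampled_cases = f_case * p                # diseased and sampled
    sampled_controls = f_control * (1.0 - p)  # healthy and sampled
    return math.log(sampled_cases / sampled_controls)


# Hypothetical population model and sampling fractions (assumptions).
alpha, beta = -4.0, 0.7
f_case, f_control = 0.9, 0.01

logit_unexposed = case_control_logit(alpha, beta, 0, f_case, f_control)
logit_exposed = case_control_logit(alpha, beta, 1, f_case, f_control)

# Slope (log odds ratio) seen in the case-control sample.
slope = logit_exposed - logit_unexposed
# Shift of the intercept relative to the prospective model.
intercept_shift = logit_unexposed - alpha

print("log odds ratio in sample:", slope)
print("prospective beta:        ", beta)
print("intercept shift:         ", intercept_shift)
print("log(f_case / f_control): ", math.log(f_case / f_control))
```

Because only the intercept is affected, a standard logistic regression routine applied to the sampled data targets the same odds-ratio parameter, which is the sense in which standard packages for quantal-response data remain usable.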

[1] R. Prentice, et al. Further results on covariate measurement errors in cohort studies with time to response data. Statistics in Medicine, 1989.

[2] W. Ahrens, et al. Occupational and environmental hazards associated with lung cancer. International Journal of Epidemiology, 1992.

[3] A. Scott, et al. Fitting Logistic Regression Models in Stratified Case-Control Studies. 1991.

[4] Charles F. Manski, et al. Estimation of Response Probabilities From Augmented Retrospective Observations. 1985.

[5] R. Pyke, et al. Logistic disease incidence models and case-control studies. 1979.

[6] James J. Schlesselman. Case-Control Studies: Design, Conduct, Analysis. 1982.

[7] J. E. White, et al. A two stage design for the study of the relationship between a rare exposure and a rare disease. American Journal of Epidemiology, 1982.

[8] N. Breslow, et al. Statistical methods in cancer research. Vol. 1. The analysis of case-control studies. 1981.

[9] N. E. Breslow, et al. Logistic regression for stratified case-control studies. Biometrics, 1988.

[10] Norman E. Breslow, et al. Logistic regression for two-stage case-control data. 1988.

[11] T. Fears, et al. Logistic regression methods for retrospective case-control studies using complex sampling procedures. Biometrics, 1986.

[12] N. Breslow, et al. Statistical methods in cancer research: volume 1 - The analysis of case-control studies. 1980.

[13] Chris J. Wild, et al. Fitting prospective regression models to case-control data. 1991.

[14] Raymond J. Carroll, et al. On errors-in-variables for binary regression models. 1984.

[15] J. Nelder, et al. The GLIM System Release 3. 1979.

[16] M. Gail, et al. Introduction. Errors-in-variables workshop. 1989.

[17] B. G. Armstrong, et al. Analysis of case-control data with covariate measurement error: application to diet and colon cancer. Statistics in Medicine, 1989.

[18] Duncan C. Thomas. General relative-risk models for survival time and matched case-control analysis. 1981.