Semiparametric inference in matched case-control studies with missing covariate data

We consider matched studies with a binary outcome that are analysed using conditional logistic regression, and in which data on some covariates are missing for some study participants. Methods for this problem either model the distribution of the missing covariates or model the probability that data are missing. For the second approach, the previously proposed method used data from persons with missing covariates only in the model for the missingness. We propose a new class of estimators that use the outcome and available covariate data for all study participants, and show that a particular member of this class is always more efficient than the previously proposed estimator. We illustrate the efficiency gains that are possible with our approach using simulated data. Copyright Biometrika Trust 2002, Oxford University Press.
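As a rough illustration of the second approach (modelling the probability that data are missing), the sketch below fits a weighted conditional logistic regression for 1:1 matched pairs with a single covariate. It is not the estimator proposed in the paper. The function name `weighted_clogit_pairs`, the restriction to pairs, and the choice of weights (the inverse of an estimated probability that a pair is fully observed) are assumptions made for illustration only.

```python
import numpy as np

def weighted_clogit_pairs(d, w, n_iter=50, tol=1e-10):
    """Weighted conditional logistic fit for 1:1 matched pairs (illustrative).

    d : case-minus-control covariate difference for each fully observed pair
    w : hypothetical weight per pair, e.g. 1 / (estimated probability that
        both members of the pair have complete covariate data)

    Each pair contributes w * log(sigmoid(beta * d)) to the weighted
    conditional log-likelihood; beta is found by Newton's method.
    Assumes at least one pair has d != 0, otherwise the information is zero.
    """
    d = np.asarray(d, dtype=float)
    w = np.asarray(w, dtype=float)
    beta = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-beta * d))      # conditional probability per pair
        score = np.sum(w * d * (1.0 - p))        # weighted score
        info = np.sum(w * d**2 * p * (1.0 - p))  # weighted information
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Binary covariate: three pairs with the case exposed only, one with the
# control exposed only, one concordant pair (which carries no information).
beta_hat = weighted_clogit_pairs([1, 1, 1, -1, 0], [1, 1, 1, 1, 1])
```

With unit weights and a binary covariate, the estimated odds ratio `exp(beta_hat)` reduces to the ratio of the two discordant-pair counts. Unequal weights replace those counts with weighted sums.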
