Logistic Regression With Incomplete Covariate Data in Complex Survey Sampling: Application of Reweighted Estimating Equations

Weighted survey data in which some covariates are missing present a substantial challenge for analysis. We addressed this problem by applying a reweighting technique to estimate the parameters of a logistic regression model. Each survey weight was adjusted by the inverse of the estimated probability that the possibly missing covariate was observed. In a simulation study, the reweighted estimating equations procedure was compared with a complete case analysis (which discards any subject with missing data) to assess bias reduction. The method was also applied to data from a national health survey, the National Health and Nutrition Examination Survey (NHANES). Adjusting the sampling weights by the inverse probability of being completely observed appears to be effective in accounting for missing data and in reducing the bias of the complete case estimates of the regression coefficients.
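
The reweighting idea can be illustrated with a minimal sketch in Python. It is not the authors' implementation: the function names (`fit_weighted_logistic`, `reweighted_logistic`), the use of a logistic model for the probability of being observed, and the choice to weight that model by the survey weights are all assumptions made for illustration. The sketch also ignores design-based variance estimation, which a real survey analysis would need.

```python
import numpy as np


def fit_weighted_logistic(X, y, w, n_iter=50, tol=1e-10):
    """Solve sum_i w_i * x_i * (y_i - expit(x_i' beta)) = 0 by Newton-Raphson.

    X: (n, p) design matrix (include a column of ones for the intercept).
    y: (n,) binary outcome coded 0/1.
    w: (n,) non-negative weights.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        score = X.T @ (w * (y - p))                      # weighted estimating equation
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information matrix
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def reweighted_logistic(X, y, survey_w, observed, Z):
    """Hypothetical helper: complete case fit with inverse-probability-adjusted weights.

    X: (n, p) outcome-model design matrix; rows with observed == False may hold NaN.
    observed: (n,) boolean, True where all covariates in X were observed.
    Z: (n, q) design matrix of always-observed variables used to model
       the probability of being observed (assumed logistic here).
    """
    observed = np.asarray(observed, dtype=bool)
    # Step 1: estimate the probability that each subject is completely observed.
    # Weighting this model by the survey weights is an assumption of this sketch.
    alpha = fit_weighted_logistic(Z, observed.astype(float), survey_w)
    pi_hat = 1.0 / (1.0 + np.exp(-(Z @ alpha)))
    # Step 2: divide each complete case's survey weight by its estimated probability.
    adj_w = survey_w[observed] / pi_hat[observed]
    # Step 3: fit the outcome model on the complete cases with the adjusted weights.
    beta = fit_weighted_logistic(X[observed], y[observed], adj_w)
    return beta, adj_w
```

Passing `survey_w[observed]` directly to `fit_weighted_logistic`, without dividing by `pi_hat`, gives the complete case analysis against which the reweighted estimate is compared.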
