We propose a method for estimating parameters in general parametric regression models with an arbitrary number of missing covariates. We allow any pattern of missing data and assume throughout that the missing data mechanism is ignorable. When the missing covariates are categorical, a useful technique for obtaining parameter estimates is the EM algorithm by the method of weights proposed by Ibrahim (1990, Journal of the American Statistical Association 85, 765-769). We extend this method to continuous or mixed categorical and continuous covariates, and to arbitrary parametric regression models, by adapting a Monte Carlo version of the EM algorithm as discussed by Wei and Tanner (1990, Journal of the American Statistical Association 85, 699-704). In addition, we discuss the Gibbs sampler for sampling from the conditional distribution of the missing covariates given the observed data and show that the appropriate complete conditionals are log-concave. This log-concavity allows a straightforward implementation of the Gibbs sampler via the adaptive rejection algorithm of Gilks and Wild (1992, Applied Statistics 41, 337-348). We assume the model for the response given the covariates is an arbitrary parametric regression model, such as a generalized linear model, a parametric survival model, or a nonlinear model. We model the marginal distribution of the covariates as a product of one-dimensional conditional distributions. This factorization gives considerable flexibility in modeling the distribution of the covariates and reduces the number of nuisance parameters introduced in the E-step. We present examples involving both simulated and real data.
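The Monte Carlo EM idea described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on simplifying assumptions that are not in the abstract: a normal linear model y | x ~ N(b0 + b1*x, s2), a single continuous covariate x ~ N(mu, tau2), and missing values coded as NaN. In this special case the conditional distribution of a missing x given y is itself normal, so the draws are made directly; in the general setting of the paper this step would instead use Gibbs sampling with adaptive rejection. The function name mcem_linear and all parameter names are hypothetical.

```python
import numpy as np

def mcem_linear(y, x, n_iter=50, m=200, seed=0):
    """Monte Carlo EM sketch for y | x ~ N(b0 + b1*x, s2), x ~ N(mu, tau2).

    Assumption (illustrative only): NaN entries of x are missing and the
    missingness mechanism is ignorable.
    """
    rng = np.random.default_rng(seed)
    miss = np.isnan(x)
    n, n_mis = len(y), int(miss.sum())
    xo, yo = x[~miss], y[~miss]

    # Starting values from the complete cases.
    b1, b0 = np.polyfit(xo, yo, 1)          # polyfit returns [slope, intercept]
    s2 = np.var(yo - b0 - b1 * xo)
    mu, tau2 = xo.mean(), xo.var()

    # Method of weights: observed cases get weight 1, and each of the m
    # draws for an incomplete case gets weight 1/m.
    w = np.concatenate([np.ones(n - n_mis), np.full(n_mis * m, 1.0 / m)])
    y_aug = np.concatenate([yo, np.repeat(y[miss], m)])

    for _ in range(n_iter):
        # Monte Carlo E-step: draw missing x from p(x | y, current parameters).
        prec = 1.0 / tau2 + b1 ** 2 / s2
        mean = (mu / tau2 + b1 * (y[miss] - b0) / s2) / prec
        draws = rng.normal(mean[:, None], np.sqrt(1.0 / prec), size=(n_mis, m))
        x_aug = np.concatenate([xo, draws.ravel()])

        # M-step: weighted maximum likelihood on the augmented data.
        X = np.column_stack([np.ones_like(x_aug), x_aug])
        WX = X * w[:, None]
        b0, b1 = np.linalg.solve(X.T @ WX, WX.T @ y_aug)
        s2 = np.sum(w * (y_aug - b0 - b1 * x_aug) ** 2) / n
        mu = np.sum(w * x_aug) / n
        tau2 = np.sum(w * (x_aug - mu) ** 2) / n

    return {"b0": float(b0), "b1": float(b1), "s2": float(s2),
            "mu": float(mu), "tau2": float(tau2)}
```

With several covariates, the covariate model in this sketch would be replaced by the product of one-dimensional conditional distributions described above, and the direct normal draw by one Gibbs scan over the missing components of each incomplete case.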
[1] J. G. Ibrahim et al. Parameter estimation from incomplete data in binomial regression when the missing data mechanism is nonignorable. Biometrics, 1996.
[2] T. Smith et al. A Randomized Phase II Study of Acivicin and 4'Deoxydoxorubicin in Patients with Hepatocellular Carcinoma in an Eastern Cooperative Oncology Group Study. American Journal of Clinical Oncology, 1990.
[3] D. Rubin. Inference and Missing Data. 1975.
[4] J. G. Ibrahim et al. Using the EM-algorithm for survival data with incomplete categorical covariates. Lifetime Data Analysis, 1996.
[5] J. G. Ibrahim et al. A conditional model for incomplete covariates in parametric regression models. 1996.
[6] S. Lipsitz et al. Hepatocellular Carcinoma: An ECOG Randomized Phase II Study of Beta‐Interferon and Menagoril. American Journal of Clinical Oncology, 1995.
[7] J. Ibrahim. Incomplete Data in Generalized Linear Models. 1990.
[8] G. C. Wei et al. A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms. 1990.
[9] W. Gilks et al. Adaptive Rejection Sampling for Gibbs Sampling. 1992.
[10] R. Little et al. Maximum likelihood estimation for mixed continuous and categorical data with missing values. 1985.