Likelihood-Based Methods for Missing Covariates in the Cox Proportional Hazards Model

Problems associated with missing covariate data are well known but often ignored. We present a method for estimating the parameters of the Cox proportional hazards model when covariate data are missing at random (MAR) and censoring is noninformative. Because this method is computationally burdensome, we introduce an approximation that allows the parameters to be estimated more easily with a weighted expectation-maximization (EM) algorithm. When the missing covariates are continuous rather than categorical, we implement a Monte Carlo version of the EM algorithm, using the Gibbs sampler, to obtain parameter estimates. We also give the asymptotic distribution of these estimates. The primary advantage of this method over complete case analysis is that it produces more efficient parameter estimates and corrects for bias in the MAR setting. To motivate the methodology, we present an analysis of a phase III melanoma clinical trial conducted by the Eastern Cooperative Oncology Group.
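
As a rough illustration of the weighted EM idea for categorical missing covariates, the E-step can be sketched as below. This is a generic "EM by the method of weights" formulation under MAR and noninformative censoring, not the specific approximation developed in the paper. The notation is assumed for illustration: $t_i$ is the observed time, $\delta_i$ the event indicator, $x_i = (x_{\mathrm{obs},i}, x_{\mathrm{mis},i})$ the covariate vector, $\beta$ the regression coefficients, $\lambda_0$ and $\Lambda_0$ the baseline and cumulative baseline hazards, and $\alpha$ the parameters of an assumed covariate distribution.

```latex
% Complete-data likelihood contribution of subject i (assumed notation)
L_i(\theta; x_i) = \left[\lambda_0(t_i)\, e^{x_i^\top \beta}\right]^{\delta_i}
  \exp\!\left\{-\Lambda_0(t_i)\, e^{x_i^\top \beta}\right\} p(x_i \mid \alpha),
\qquad \theta = (\beta, \Lambda_0, \alpha).

% E-step weights: posterior probability of each possible value j
% of the missing categorical covariate, given the observed data
w_{ij}^{(m)} =
  \frac{L_i\!\left(\theta^{(m)}; x_{\mathrm{obs},i},\, x_{\mathrm{mis},i} = j\right)}
       {\sum_{k} L_i\!\left(\theta^{(m)}; x_{\mathrm{obs},i},\, x_{\mathrm{mis},i} = k\right)}.

% Weighted expected complete-data log-likelihood maximized in the M-step
Q\!\left(\theta \mid \theta^{(m)}\right) =
  \sum_{i=1}^{n} \sum_{j} w_{ij}^{(m)}
  \log L_i\!\left(\theta; x_{\mathrm{obs},i},\, x_{\mathrm{mis},i} = j\right).
```

For a subject with fully observed covariates, the inner sum has a single term with weight 1. When the missing covariates are continuous, the sum over $j$ becomes an integral, which is the situation in which a Monte Carlo E-step with draws from a Gibbs sampler replaces the exact weights.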
