Estimation with correlated censored survival data with missing covariates.

Incomplete covariate data are a common occurrence in studies in which the outcome is survival time. Further, studies in the health sciences often give rise to correlated, possibly censored, survival data. With no missing covariate data, if the marginal distributions of the correlated survival times follow a given parametric model, then the maximum likelihood estimating equations, naively treating the correlated survival times as independent, yield consistent estimates of the relative risk parameters (Lipsitz et al., 1994, Biometrics 50, 842-846). Now, suppose that some observations within a cluster have missing covariates. We show in this paper that if one naively treats observations within a cluster as independent, one can still use the maximum likelihood estimating equations to obtain consistent estimates of the relative risk parameters. This method requires estimating the parameters of the distribution of the covariates. We present results from a clinical trial (Lipsitz and Ibrahim, 1996b, Lifetime Data Analysis 2, 5-14) with five covariates, four of which have some missing values. In the trial, the clusters are the hospitals in which the patients were treated.
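To make the working-independence idea concrete, the following is a minimal sketch and not the paper's implementation. It assumes, purely for illustration, an exponential marginal survival model with a single binary covariate x that follows a Bernoulli(p) distribution and is missing at random; the function names, the record layout, and the parameters b0, b1 and p are all hypothetical. Each observation contributes its own log-likelihood term regardless of cluster, and a record with a missing covariate contributes the survival likelihood averaged over the assumed covariate distribution, which is why the covariate parameter p must be estimated as well.

```python
import math


def log_surv_density(t, delta, x, b0, b1):
    """Log of hazard**delta * survival under an (assumed) exponential model.

    The hazard is exp(b0 + b1 * x); delta is 1 for an event, 0 if censored.
    """
    log_rate = b0 + b1 * x
    return delta * log_rate - math.exp(log_rate) * t


def record_loglik(t, delta, x, b0, b1, p):
    """Observed-data log-likelihood of one record; x is 0, 1, or None (missing)."""
    if x is not None:
        # Covariate observed: joint density of (t, delta) given x, times P(x).
        log_px = math.log(p) if x == 1 else math.log(1.0 - p)
        return log_surv_density(t, delta, x, b0, b1) + log_px
    # Covariate missing: sum the joint density over both possible values of x.
    terms = [
        log_surv_density(t, delta, 1, b0, b1) + math.log(p),
        log_surv_density(t, delta, 0, b0, b1) + math.log(1.0 - p),
    ]
    m = max(terms)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(v - m) for v in terms))


def working_independence_loglik(records, b0, b1, p):
    """Sum record contributions, ignoring the cluster label (naive independence)."""
    return sum(
        record_loglik(t, delta, x, b0, b1, p)
        for (_cluster, t, delta, x) in records
    )


# Hypothetical records: (cluster id, time, event indicator, covariate or None).
records = [("A", 2.0, 1, None), ("A", 1.0, 0, 1), ("B", 0.5, 1, 0)]
total = working_independence_loglik(records, b0=0.3, b1=-0.2, p=0.4)
```

Maximizing this function over (b0, b1, p) with any general-purpose optimizer would give the naive point estimates. Because the clustering is ignored, standard errors would need a method that accounts for the correlation, such as a robust or jackknife variance estimator.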

[1] Margaret S. Pepe, et al. Inference for Events with Dependent Risks in Multiple Endpoint Studies, 1991.

[2] Nan M. Laird, et al. Regression Analysis for Categorical Variables with Outcome Subject to Nonignorable Nonresponse, 1988.

[3] Jianwen Cai, et al. Estimating equations for hazard ratio parameters based on correlated failure time data, 1995.

[4] G. Casella, et al. Statistical Inference, 2003, Encyclopedia of Social Network Analysis and Mining.

[5] Jerald F. Lawless, et al. Statistical Models and Methods for Lifetime Data, 1983.

[6] J. G. Ibrahim, et al. Using the EM-algorithm for survival data with incomplete categorical covariates, 1996, Lifetime Data Analysis.

[7] R. Little, et al. Proportional hazards regression with missing covariates, 1999.

[8] S. Zeger, et al. Longitudinal data analysis using generalized linear models, 1986.

[9] M. Gu, et al. A stochastic approximation algorithm for maximum-likelihood estimation with incomplete data, 1998.

[10] L. J. Wei, et al. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions, 1989.

[11] G. C. Wei, et al. A Monte Carlo Implementation of the EM Algorithm and the Poor Man's Data Augmentation Algorithms, 1990.

[12] D. Rubin, et al. Statistical Analysis with Missing Data, 1988.

[13] M. Pepe, et al. Auxiliary covariate data in failure time regression, 1995.

[14] Alan Agresti, et al. Categorical Data Analysis, 1991, International Encyclopedia of Statistical Science.

[15] R. Prentice, et al. Correlated binary regression with covariates specific to each binary observation, 1988, Biometrics.

[16] J. Ibrahim. Incomplete Data in Generalized Linear Models, 1990.

[17] H. White. Maximum Likelihood Estimation of Misspecified Models, 1982.

[18] T. Smith, et al. A Randomized Phase II Study of Acivicin and 4'Deoxydoxorubicin in Patients with Hepatocellular Carcinoma in an Eastern Cooperative Oncology Group Study, 1990, American Journal of Clinical Oncology.

[19] S. R. Lipsitz, et al. Jackknife estimators of variance for parameter estimates from estimating equations with applications to clustered survival data, 1994, Biometrics.

[20] Joseph G. Ibrahim, et al. Missing covariates in generalized linear models when the missing data mechanism is non-ignorable, 1999.

[21] M. Kendall. Theoretical Statistics, 1956, Nature.

[22] Joseph G. Ibrahim, et al. A conditional model for incomplete covariates in parametric regression models, 1996.

[23] S. Lipsitz, et al. Hepatocellular Carcinoma: An ECOG Randomized Phase II Study of Beta-Interferon and Menogaril, 1995, American Journal of Clinical Oncology.

[24] J. G. Ibrahim, et al. Monte Carlo EM for Missing Covariates in Parametric Regression Models, 1999, Biometrics.

[25] M. Segal, et al. Dependence Estimation for Marginal Models of Multivariate Survival Data, 1997, Lifetime Data Analysis.

[26] D. Rubin, et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper, 1977.

[27] W. Gilks, et al. Adaptive Rejection Sampling for Gibbs Sampling, 1992.