Information bounds for Cox regression models with missing data

We derive information bounds for the regression parameters in Cox models when data are missing at random (MAR). These bounds are of interest for understanding the behavior of efficient estimation in case-cohort designs, a type of two-phase design often used in cohort studies. The derivations rely on key lemmas from Robins, Rotnitzky and Zhao [J. Amer. Statist. Assoc. 89 (1994) 846-866] and Robins, Hsieh and Newey [J. Roy. Statist. Soc. Ser. B 57 (1995) 409-424], restated in a form suited to our purposes. We begin by summarizing the results of Robins, Rotnitzky and Zhao in a form that leads directly to the projection method we use for our model of interest. We then derive new information bounds for the regression parameters of the Cox model with MAR data. In the final section we illustrate our calculations with several models of interest in cohort studies, including an i.i.d. version of the classical case-cohort design of Prentice [Biometrika 73 (1986) 1-11] and Self and Prentice [Ann. Statist. 16 (1988) 64-81].

[1]  J. Robins,et al.  Recovery of Information and Adjustment for Dependent Censoring Using Surrogate Markers , 1992 .

[2]  M. Emond,et al.  Information Bounds for Regression Models with Missing Data , 2000 .

[3]  Bradley Efron,et al.  Fisher's Information in Terms of the Hazard Rate , 1990 .

[4]  A. Jon Information Bounds for Regression Models with Missing Data , 2000 .

[5]  Bryan Langholz,et al.  Exposure Stratified Case-Cohort Designs , 2000, Lifetime data analysis.

[6]  R A Goldbohm,et al.  Are retinol, vitamin C, vitamin E, folate and carotenoids intake associated with bladder cancer risk? Results from the Netherlands Cohort Study , 2001, British Journal of Cancer.

[7]  Jian Huang Efficient estimation of the partly linear additive Cox model , 1999 .

[8]  David R. Cox,et al.  Regression models and life tables (with discussion) , 1972 .

[9]  P. Bickel Efficient and Adaptive Estimation for Semiparametric Models , 1993 .

[10]  R. Kanwal Linear Integral Equations , 1925, Nature.

[11]  J. Robins,et al.  Estimation of Regression Coefficients When Some Regressors are not Always Observed , 1994 .

[12]  Jon A. Wellner,et al.  Empirical Processes with Applications to Statistics. , 1988 .

[13]  J H Eckfeldt,et al.  A prospective study of coronary heart disease and the hemochromatosis gene (HFE) C282Y mutation: the Atherosclerosis Risk in Communities (ARIC) study. , 2001, Atherosclerosis.

[14]  B. Efron The Efficiency of Cox's Likelihood Function for Censored Data , 1977 .

[15]  P. Sasieni,et al.  Information Bounds for the Conditional Hazard Ratio in a Nested Family of Regression Models , 1992 .

[16]  A. Miller,et al.  Dietary intake of folic acid and colorectal cancer risk in a cohort of women , 2002, International journal of cancer.

[17]  Kani Chen,et al.  Case-cohort and case-control analysis with Cox's model , 1999 .

[18]  J. Wellner,et al.  Empirical Processes with Applications to Statistics , 2009 .

[19]  W. J. Hall,et al.  Information and Asymptotic Efficiency in Parametric-Nonparametric Models , 1983 .

[20]  D. Margolis,et al.  Hormone replacement therapy and prevention of pressure ulcers and venous leg ulcers , 2002, The Lancet.

[21]  Yi-Hau Chen,et al.  A Pseudoscore Estimator for Regression Problems With Two-Phase Sampling , 2003 .

[22]  P. Sasieni,et al.  Non-orthogonal projections and their application to calculating the information in a partly linear Cox model , 1992 .

[23]  James M. Robins,et al.  Semiparametric efficient estimation of a conditional density with missing or mismeasured covariates , 1995 .

[24]  D. Horvitz,et al.  A Generalization of Sampling Without Replacement from a Finite Universe , 1952 .

[25]  R. L. Prentice,et al.  A case-cohort design for epidemiologic cohort studies and disease prevention trials , 1986 .

[26]  R. Prentice,et al.  Commentary on Andersen and Gill's "Cox's Regression Model for Counting Processes: A Large Sample Study" , 1982 .

[27]  K. Do,et al.  Efficient and Adaptive Estimation for Semiparametric Models. , 1994 .

[28]  N. Breslow,et al.  High telomerase reverse transcriptase (hTERT) messenger RNA level correlates with tumor recurrence in patients with favorable histology Wilms' tumor. , 1999, Cancer research.

[29]  I. Hertz-Picciotto,et al.  Case-cohort analysis of agricultural pesticide applications near maternal residence and selected causes of fetal death. , 2001, American journal of epidemiology.

[30]  Steven G. Self,et al.  Asymptotic Distribution Theory and Efficiency Results for Case-Cohort Studies , 1988 .

[31]  E W Gunter,et al.  Prospective study of serum selenium levels and incident esophageal and gastric cancers. , 2000, Journal of the National Cancer Institute.

[32]  R. Goldbohm,et al.  Occupational risk factors for male bladder cancer: results from a population based case cohort study in the Netherlands , 2001, Occupational and environmental medicine.