Backward multiple imputation estimation of the conditional lifetime expectancy function with application to censored human longevity data

Significance: Lifetime expectancy, the expected lifetime of a subject given survival past a certain time, is often estimated from right-censored data. Right censoring is a form of missing data that commonly arises in biomedical applications such as clinical trials: the time-to-event is observed only if it occurs prior to some prespecified time. We report a flexible method in which users are free to choose a base model for estimating lifetime expectancy, and the right-censored times are imputed in backward order to address the data “missingness.” We use this tool to explore human aging given individual attributes including gender, smoking, body mass index, socioeconomic status, and diseases.

The conditional lifetime expectancy function (LEF) is the expected lifetime of a subject given survival past a certain time point and the values of a set of explanatory variables. This function is attractive to researchers because it summarizes the entire residual life distribution and is easier to interpret than the widely used hazard function. In this paper, we propose a general framework of backward multiple imputation for estimating the conditional LEF and the variance of the estimator in the right-censoring setting. Simulation studies are conducted to investigate the empirical properties of the proposed estimator and the corresponding variance estimator. We demonstrate the method on the Beaver Dam Eye Study data, where the expected human lifetime is modeled with smoothing-spline ANOVA given covariate information including sex, lifestyle factors, and disease variables.
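To make the idea of imputing censored times in backward order concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the simplest possible base model (no covariates): each censored time is replaced by a random draw from the already completed lifetimes that exceed it, starting from the latest censoring time and working backward. The function names `backward_impute` and `lifetime_expectancy`, the choice of 20 imputations, and the rule of keeping a censoring time unchanged when no larger lifetime exists are all illustrative assumptions. The estimate of the expected lifetime given survival past `t` is the mean of completed lifetimes exceeding `t`, averaged over imputations.

```python
import numpy as np


def backward_impute(times, observed, rng):
    """Return one completed copy of the data (hypothetical helper).

    times    : observed or censoring time for each subject
    observed : True where the event time was observed, False where censored
    """
    times = np.asarray(times, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    completed = times.copy()
    is_complete = observed.copy()

    # Visit censored subjects from the latest censoring time to the earliest,
    # so every larger time is already complete when a subject is imputed.
    censored_idx = np.flatnonzero(~observed)
    for i in censored_idx[np.argsort(-times[censored_idx])]:
        # Candidate lifetimes: completed values strictly beyond the censoring time.
        pool = completed[is_complete & (completed > times[i])]
        if pool.size > 0:
            completed[i] = rng.choice(pool)
        # Assumption: with no larger lifetime available, keep the censoring time.
        is_complete[i] = True
    return completed


def lifetime_expectancy(times, observed, t, n_imputations=20, seed=0):
    """Average, over imputations, of the mean completed lifetime beyond t."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_imputations):
        completed = backward_impute(times, observed, rng)
        estimates.append(completed[completed > t].mean())
    return float(np.mean(estimates))


# Toy data: subjects 2 and 5 are censored at times 5 and 6.
toy_times = [2, 5, 3, 8, 6]
toy_observed = [True, False, True, True, False]
estimate = lifetime_expectancy(toy_times, toy_observed, t=2.5)
```

A covariate-dependent version would replace the random draw with draws from a fitted conditional model, which is where the freedom to choose a base model enters; that step is not shown here.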
