Bayesian variable selection in the accelerated failure time model with an application to the Surveillance, Epidemiology, and End Results breast cancer data

The accelerated failure time model is a popular model for analyzing censored time-to-event data. Analyzing this model without assuming a parametric distribution for the model error is challenging, and the difficulty grows in the presence of a large number of covariates. We develop a nonparametric Bayesian method for regularized estimation of the regression parameters in a flexible accelerated failure time model. The novelties of our method lie in modeling the error distribution of the accelerated failure time model nonparametrically, modeling the variance as a function of the mean, and adopting a variable selection technique in modeling the mean. The proposed method allows for identifying a set of important regression parameters, estimating survival probabilities, and constructing credible intervals for the survival probabilities. We evaluate the operating characteristics of the proposed method via simulation studies. Finally, we apply the method to the motivating breast cancer data from the Surveillance, Epidemiology, and End Results Program, and estimate the five-year survival probabilities for women included in the Surveillance, Epidemiology, and End Results database who were diagnosed with breast cancer between 1990 and 2000.
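
For orientation, the model class described above can be sketched in generic textbook notation. This is an illustrative sketch, not the authors' exact specification: the symbols $T_i$ (event time), $C_i$ (censoring time), $x_i$ (covariate vector), $\beta$ (regression parameters), $\varepsilon_i$ (error), and the variance function $\sigma(\cdot)$ are all assumptions introduced here. The first line is the standard accelerated failure time model, the second is the usual right-censoring scheme, and the third shows one possible way the error scale could depend on the mean.

```latex
\begin{align*}
  \log T_i &= x_i^\top \beta + \varepsilon_i, \qquad i = 1, \dots, n, \\
  Y_i &= \min(T_i, C_i), \qquad \delta_i = \mathbb{1}(T_i \le C_i), \\
  \log T_i &= \mu_i + \sigma(\mu_i)\,\varepsilon_i, \qquad \mu_i = x_i^\top \beta
  \quad \text{(assumed form of a mean-dependent variance)}.
\end{align*}
```

Under this reading, variable selection acts on the components of $\beta$ in the mean $\mu_i$, while the distribution of $\varepsilon_i$ is left unspecified and modeled nonparametrically.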
