A Tutorial on Evaluating the Time-Varying Discrimination Accuracy of Survival Models Used in Dynamic Decision Making

Many medical decisions use dynamic information collected on individual patients to predict likely transitions in their future health status. When its predictions are accurate, a prognostic model can identify the patients at greatest risk for future adverse events and may be used clinically to define populations appropriate for targeted intervention. In practice, a prognostic model is often used to guide decisions at multiple time points over the course of disease, and its classification performance (i.e., sensitivity and specificity) for distinguishing high-risk from low-risk individuals may vary over time as an individual’s disease status and prognostic information change. In this tutorial, we detail contemporary statistical methods that characterize the time-varying accuracy of prognostic survival models used for dynamic decision making. Although statistical methods for evaluating prognostic models with simple binary outcomes are well established, methods appropriate for survival outcomes are less well known and require time-dependent extensions of sensitivity and specificity to fully characterize longitudinal biomarkers or models. The methods we review are particularly important because they appropriately handle the censored outcomes commonly encountered with event time data. We highlight the importance of determining whether clinical interest lies in predicting cumulative (or prevalent) cases over a fixed future time interval or in predicting incident cases over a range of follow-up times, and whether patient information is static or updated over time. We discuss implementation of time-dependent receiver operating characteristic approaches using relevant R statistical software packages. The statistical summaries are illustrated using a prognostic model for liver disease that guides transplantation decisions in primary biliary cirrhosis.
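
The distinction between cumulative and incident cases can be made concrete with the standard time-dependent definitions of sensitivity and specificity. The sketch below is illustrative only: the notation is an assumption introduced here, not taken from the tutorial, with T denoting the event time, M the marker or model score (larger values indicating higher risk), c a classification threshold, and t a follow-up time.

```latex
% Cumulative sensitivity: cases are all subjects with an event by time t
\mathrm{Se}^{\mathbb{C}}(c, t) = P(M > c \mid T \le t)

% Incident sensitivity: cases are subjects with an event exactly at time t
\mathrm{Se}^{\mathbb{I}}(c, t) = P(M > c \mid T = t)

% Dynamic specificity: controls are subjects still event-free at time t
\mathrm{Sp}^{\mathbb{D}}(c, t) = P(M \le c \mid T > t)

% Time-dependent AUC: probability that a case at time t has a higher
% marker value than a control at time t (case i and control j drawn
% under whichever case definition above is in use)
\mathrm{AUC}(t) = P(M_i > M_j \mid i \text{ is a case at } t,\; j \text{ is a control at } t)
```

Under these assumed definitions, the cumulative form suits questions about risk within a fixed horizon, whereas the incident form suits decisions revisited at successive follow-up times. When patient information is updated over time, M would be replaced by the most recent measurement available at the time of prediction.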
