On the C‐statistics for evaluating overall adequacy of risk prediction procedures with censored survival data

For modern evidence‐based medicine, a well thought‐out risk scoring system for predicting the occurrence of a clinical event plays an important role in selecting prevention and treatment strategies. Such an index system is often established based on the subject's ‘baseline’ genetic or clinical markers via a working parametric or semi‐parametric model. To evaluate the adequacy of such a system, C‐statistics are routinely used in the medical literature to quantify the capacity of the estimated risk score to discriminate among subjects with different event times. The C‐statistic provides a global assessment of a fitted survival model for the continuous event time rather than focussing on the prediction of t‐year survival for a fixed time. When the event time is possibly censored, however, the population parameters corresponding to the commonly used C‐statistics may depend on the study‐specific censoring distribution. In this article, we present a simple C‐statistic without this shortcoming. The new procedure consistently estimates a conventional concordance measure which is free of censoring. We provide a large sample approximation to the distribution of this estimator for making inferences about the concordance measure. Results from numerical studies suggest that the new procedure performs well in finite samples. Copyright © 2011 John Wiley & Sons, Ltd.
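
The abstract does not spell out the estimator, so the following is only a minimal sketch of one way a concordance statistic can be made insensitive to the censoring pattern: weighting each comparable pair by the inverse squared Kaplan-Meier estimate of the censoring survival function. The function names (`km_censoring_survival`, `ipcw_c_statistic`), the truncation time `tau`, the use of the left limit of the censoring curve, and the treatment of tied scores as non-concordant are all assumptions made for illustration, not details taken from the article.

```python
def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G.

    Censoring (events[i] == 0) is treated as the "event" here.
    Returns a function giving the left limit G(t-), which stays
    positive at every observed event time.
    """
    steps = []  # (time, value of G just after that time)
    g = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        censored = sum(1 for x, e in zip(times, events) if x == t and e == 0)
        g *= 1.0 - censored / at_risk
        steps.append((t, g))

    def g_left(t):
        value = 1.0
        for s, gs in steps:
            if s >= t:
                break
            value = gs
        return value

    return g_left


def ipcw_c_statistic(times, events, scores, tau):
    """Inverse-probability-of-censoring weighted concordance (sketch).

    A pair (i, j) is usable when subject i has an observed event before
    tau and before subject j's follow-up time. It is concordant when the
    earlier failure has the strictly higher risk score.
    """
    g_left = km_censoring_survival(times, events)
    num = den = 0.0
    for i in range(len(times)):
        if events[i] != 1 or times[i] >= tau:
            continue
        weight = 1.0 / g_left(times[i]) ** 2
        for j in range(len(times)):
            if times[i] < times[j]:
                den += weight
                if scores[i] > scores[j]:
                    num += weight
    if den == 0.0:
        raise ValueError("no usable pairs before tau")
    return num / den


if __name__ == "__main__":
    # Toy data: follow-up times, event indicators (1 = event, 0 = censored),
    # and risk scores where a higher score means higher predicted risk.
    times = [1, 2, 3, 4]
    events = [1, 0, 1, 1]
    scores = [4, 1, 2, 3]
    print(ipcw_c_statistic(times, events, scores, tau=5))
```

With no censored observations every weight equals one, so the sketch reduces to the plain proportion of concordant pairs among usable pairs.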
