Explained variation for recurrent event data

Although many measures of explained variation have been suggested for single-event survival data, little attention has been paid to explained variation for recurrent event data. We describe an existing rank-based measure and investigate a new statistic based on observed and expected event count processes. Both measures can be used with any model. We propose adjustments for missing data and demonstrate through simulation that they are effective. We compare the population values of the two statistics and illustrate their use in comparing an array of non-nested models for data on recurrent episodes of infant diarrhoea.
