Assessment and comparison of prognostic classification schemes for survival data.

Prognostic classification schemes have often been used in medical applications, but rarely subjected to a rigorous examination of their adequacy. For survival data, the statistical methodology to assess such schemes consists mainly of a range of ad hoc approaches, and there is an alarming lack of commonly accepted standards in this field. We review these methods and develop measures of inaccuracy which may be calculated in a validation study in order to assess the usefulness of estimated patient-specific survival probabilities associated with a prognostic classification scheme. These measures are meaningful even when the estimated probabilities are misspecified, and asymptotically they are not affected by random censorship. In addition, they can be used to derive R²-type measures of explained residual variation. A breast cancer study will serve for illustration throughout the paper.
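
The abstract does not name a specific measure, so the following is only a minimal sketch of one way such a measure of inaccuracy could be computed in a validation sample. It assumes a quadratic (Brier-type) score at a fixed time point t, with inverse-probability-of-censoring weights taken from a Kaplan-Meier estimate of the censoring distribution, and an R²-type measure formed as one minus the ratio of the model's score to that of a reference prediction. The function names (`km_survival`, `brier_score`, `explained_variation`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def km_survival(times, events):
    """Kaplan-Meier estimate as a step function s(t, left=False).

    With left=True the left limit S(t-) is returned instead of S(t).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    uniq = np.unique(times[events])                  # distinct event times
    if uniq.size == 0:                               # no events: S is identically 1
        return lambda t, left=False: np.ones(np.shape(t))
    at_risk = np.array([(times >= u).sum() for u in uniq])
    d = np.array([((times == u) & events).sum() for u in uniq])
    surv = np.cumprod(1.0 - d / at_risk)

    def s(t, left=False):
        # side="right" counts event times <= t, side="left" counts those < t
        k = np.searchsorted(uniq, t, side="left" if left else "right")
        return np.where(k == 0, 1.0, surv[np.maximum(k - 1, 0)])

    return s

def brier_score(t, times, events, pred_surv):
    """Censoring-weighted quadratic score at time t (assumed form).

    pred_surv[i] is the predicted probability that patient i survives beyond t.
    Assumes the estimated censoring survival is positive wherever it is used.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    pred = np.asarray(pred_surv, dtype=float)
    G = km_survival(times, ~events)                  # censoring plays the role of the event
    died = (times <= t) & events                     # observed to fail by t
    alive = times > t                                # known to survive beyond t
    w = np.zeros_like(times)                         # censored before t: weight 0
    w[died] = 1.0 / G(times[died], left=True)
    w[alive] = 1.0 / G(t)
    return float(np.mean(w * (alive.astype(float) - pred) ** 2))

def explained_variation(bs_model, bs_reference):
    """R^2-type measure: relative reduction in inaccuracy versus a reference."""
    return 1.0 - bs_model / bs_reference
```

In this sketch the reference prediction would typically be a single pooled survival estimate applied to every patient, so that the R²-type value reflects what the classification scheme adds beyond ignoring the covariates.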
