What do we mean by validating a prognostic model?

Prognostic models are used in medicine to investigate patient outcome in relation to patient and disease characteristics. Such models do not always work well in practice, so it is widely recommended that they be validated. Validating a prognostic model is generally taken to mean establishing that it works satisfactorily for patients other than those from whose data it was derived. In this paper we examine what is meant by validation and review why it is necessary. We consider how to validate a model, suggest that it is desirable to consider two rather different aspects (statistical validity and clinical validity), and examine some general approaches to validation. We illustrate the issues using several case studies.
