Clinical prediction models

Clinical prediction models (also known as prognostic models or risk scores) are mathematical equations that relate multiple predictors (risk factors or covariates) to the probability of having a disease or condition (diagnostic) or the probability that an event will happen in the future (prognostic)1. In the field of surgery, many models have been developed that predict outcomes (such as mortality) following surgery. Well-known prediction models include EuroSCORE II, Portsmouth POSSUM and the American College of Surgeons National Surgical Quality Improvement Program surgical risk calculator2. Predictions from these models can be used for planning lifestyle changes, guiding therapeutic decisions, stratifying participants in randomized controlled trials or risk adjustment when comparing hospital or surgeon performance. Although prediction models can be used to gain insight into the causality of the outcome of interest, this is neither their aim nor a requirement. Furthermore, not every predictor is causal, although any causal factor can serve as a predictor.

Introducing a new prediction model involves three steps: model development, model validation and assessment of model impact3,4. Model development involves selecting predictors and combining them into a multivariable model, typically using logistic or Cox regression. An important consideration at this stage is to choose carefully which predictors are examined, and to limit the number of predictors relative to the sample size to avoid overfitting. Overfitting reduces the predictive ability of the model when it is applied to new data.

Model validation can be separated broadly into internal and external validation. Internal validation, which is part of the model development process, quantifies any optimism that results from overfitting, using methods such as bootstrapping or cross-validation. External validation evaluates the transportability of the model in a data set other than the one used to develop it. Both types of validation involve evaluating the performance of the model in terms of discrimination, calibration and sometimes clinical usefulness5. The final stage is assessing model impact, that is, determining whether using the model improves decision-making6.
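As a rough illustration of the two performance measures named above, the sketch below computes a c-statistic (discrimination) and an observed-to-expected event ratio (one simple summary of calibration) using only the Python standard library. The function names, the choice of the observed-to-expected ratio as the calibration measure, and the eight-patient data set are all assumptions made for illustration; they are not taken from any of the models or references mentioned in the text.

```python
from itertools import product


def c_statistic(outcomes, predicted_risks):
    """Discrimination: proportion of (event, non-event) pairs in which the
    patient with the event received the higher predicted risk.
    Tied predictions count as half a concordant pair."""
    events = [p for y, p in zip(outcomes, predicted_risks) if y == 1]
    non_events = [p for y, p in zip(outcomes, predicted_risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event, p_non_event in product(events, non_events):
        if p_event > p_non_event:
            concordant += 1.0
        elif p_event == p_non_event:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


def observed_expected_ratio(outcomes, predicted_risks):
    """Calibration summary: observed number of events divided by the sum of
    predicted risks. A value near 1 means the model is right on average;
    above 1 suggests underprediction, below 1 overprediction."""
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("sum of predicted risks is zero")
    return sum(outcomes) / expected


# Hypothetical validation data: 1 = event occurred, 0 = no event.
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
predicted_risks = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.5, 0.9]

c_stat = c_statistic(outcomes, predicted_risks)
oe_ratio = observed_expected_ratio(outcomes, predicted_risks)
print(f"c-statistic: {c_stat:.3f}")
print(f"observed/expected ratio: {oe_ratio:.3f}")
```

In practice a fuller calibration assessment would usually go beyond a single ratio, for example by comparing observed and predicted risks across the range of predictions, and the same measures would be computed on bootstrap samples (internal validation) or on an independent data set (external validation).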