Validating and updating a risk model for pneumonia – a case study

Background: The development of risk prediction models is of increasing importance in medical research, but their use in practice remains rare. One likely reason is that thorough validation is often lacking. This study focuses on two Bayesian approaches to validating a prediction rule for the diagnosis of pneumonia and compares them with established validation methods.

Methods: Expert knowledge was used to derive a risk prediction model for pneumonia. Data on more than 600 patients presenting with cough and fever at a general practitioner’s practice in Switzerland were collected to validate the expert model and to examine its predictive performance. Additionally, four modifications of the original model, including shrinkage of the regression coefficients, and two Bayesian approaches, with the expert model used as prior mean and different weights for the prior covariance matrix, were fitted. The predictive performance of the different methods was quantified with respect to calibration and discrimination, using cross-validation.

Results: The predictive performance of the unshrunken regression coefficients was poor when applied to the Swiss cohort. Shrinkage improved the results, but a Bayesian model formulation with unspecified weight of the informative prior led to a large AUC and a small Brier score, both naïve and after cross-validation. The advantage of this approach is its flexibility in case of a prior-data conflict.

Conclusions: Published risk prediction rules in clinical research need to be validated externally before they can be used in new settings. We propose using a Bayesian model formulation with the original risk prediction rule as prior. The posterior means of the coefficients, given the validation data, showed the best predictive performance with respect to cross-validated calibration and discriminative ability.
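The two performance measures named in the abstract, the Brier score (which reflects calibration) and the AUC (discrimination), can be illustrated with a minimal sketch. This is not the study's code: the function names `brier_score` and `auc` and the toy numbers are assumptions made for illustration, and the AUC is computed by the generic pairwise (Mann-Whitney) definition rather than by any particular package.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and the 0/1 outcome.

    Smaller is better; 0 means perfect predictions.
    """
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs, outcomes):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    Fraction of (case, non-case) pairs in which the case received the
    higher predicted risk; ties count as one half.
    """
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    non_cases = [p for p, y in zip(probs, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for pc in cases:
        for pn in non_cases:
            if pc > pn:
                wins += 1.0
            elif pc == pn:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


if __name__ == "__main__":
    # Hypothetical predicted risks and observed diagnoses (1 = pneumonia).
    # These numbers are made up and are not data from the study.
    predicted = [0.9, 0.8, 0.3, 0.2]
    observed = [1, 0, 1, 0]
    print("Brier score:", brier_score(predicted, observed))
    print("AUC:", auc(predicted, observed))
```

In a cross-validated assessment, the same two functions would be applied to predictions made for patients who were left out when the model was fitted, rather than to the fitted data themselves.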
