A new framework to enhance the interpretation of external validation studies of clinical prediction models.

OBJECTIVES It is widely acknowledged that the performance of diagnostic and prognostic prediction models should be assessed in external validation studies, using independent data from samples that are "different but related" to the development sample. We developed a framework of methodological steps and statistical methods for analyzing results from external validation studies of prediction models and for enhancing their interpretation.

STUDY DESIGN AND SETTING We propose to quantify the degree of relatedness between development and validation samples, on a scale ranging from reproducibility to transportability, by evaluating their case-mix differences. We then assess the model's performance in the validation sample and interpret it in view of those case-mix differences. Finally, where needed, the model may be adjusted to the validation setting.

RESULTS We illustrate this three-step framework with a prediction model for diagnosing deep venous thrombosis, using three validation samples with varying case mix. One external validation sample merely assessed the model's reproducibility, whereas the other two assessed its transportability. Performance was adequate in all validation samples, and the model did not require extensive updating to correct for miscalibration or poor fit to the validation settings.

CONCLUSION The proposed framework enhances the interpretation of findings at external validation of prediction models.
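The three steps can be made concrete with a small simulation. The abstract does not name the specific statistics used, so the sketch below rests on stated assumptions rather than on the authors' implementation: relatedness is quantified with the c-statistic of a "membership model" (a logistic regression that tries to tell development from validation patients), performance is summarized by the c-statistic, calibration-in-the-large, and calibration slope, and updating is done by logistic recalibration. All data are simulated, and every function and variable name is hypothetical.

```python
import numpy as np

def fit_logistic(X, y, offset=None, n_iter=50):
    """Logistic regression with intercept via Newton-Raphson; returns [intercept, slopes...]."""
    n = len(y)
    A = np.column_stack([np.ones(n)] if X is None else [np.ones(n), X])
    off = np.zeros(n) if offset is None else offset
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta + off)))
        step = np.linalg.solve(A.T @ (A * (p * (1 - p))[:, None]), A.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def c_statistic(score, y):
    """Probability that a random case scores higher than a random non-case (ties count 0.5)."""
    diff = score[y == 1][:, None] - score[y == 0][None, :]
    return np.mean(diff > 0) + 0.5 * np.mean(diff == 0)

def membership_c(X_dev, X_val):
    """Assumed relatedness measure: how well predictors separate the two samples."""
    X = np.vstack([X_dev, X_val])
    member = np.concatenate([np.zeros(len(X_dev)), np.ones(len(X_val))])
    beta = fit_logistic(X, member)
    return c_statistic(beta[0] + X @ beta[1:], member)

# Simulated data (hypothetical): same true model everywhere, different case mix.
rng = np.random.default_rng(0)
true_beta = np.array([-1.0, 1.0, 0.5])

def simulate(n, mean):
    X = rng.normal(loc=mean, scale=1.0, size=(n, 2))
    p = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
    return X, rng.binomial(1, p)

X_dev, y_dev = simulate(2000, [0.0, 0.0])      # development sample
X_same, y_same = simulate(2000, [0.0, 0.0])    # validation sample, similar case mix
X_shift, y_shift = simulate(2000, [0.8, 0.0])  # validation sample, shifted case mix
model = fit_logistic(X_dev, y_dev)             # the "developed" prediction model

# Step 1: relatedness. Near 0.5 suggests a reproducibility test;
# clearly above 0.5 suggests a transportability test.
c_member_same = membership_c(X_dev, X_same)
c_member_shift = membership_c(X_dev, X_shift)

# Step 2: performance of the original model in the shifted validation sample.
lp = model[0] + X_shift @ model[1:]                     # linear predictor
c_val = c_statistic(lp, y_shift)                        # discrimination
cal_large = fit_logistic(None, y_shift, offset=lp)[0]   # calibration-in-the-large
cal_intercept, cal_slope = fit_logistic(lp, y_shift)    # calibration slope

# Step 3: update by logistic recalibration of the linear predictor.
lp_updated = cal_intercept + cal_slope * lp
cal_large_updated = fit_logistic(None, y_shift, offset=lp_updated)[0]

print("membership c (similar, shifted):", c_member_same, c_member_shift)
print("validation c, CITL, slope:", c_val, cal_large, cal_slope)
```

Because the simulated validation samples share the true outcome model with the development sample, the calibration slope should be close to 1 even when the membership c-statistic shows a clear case-mix difference. This mirrors the interpretation in the abstract: adequate performance in a sample with a different case mix speaks to transportability, not merely reproducibility.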
