A framework for meta-analysis of prediction model studies with binary and time-to-event outcomes

It is widely recommended that any developed prediction model, whether diagnostic or prognostic, is externally validated in terms of its predictive performance, as measured by calibration and discrimination. When multiple validations have been performed, a systematic review followed by a formal meta-analysis helps to summarize overall performance across multiple settings and reveals under which circumstances the model performs suboptimally and may need adjustment. We discuss how to undertake meta-analysis of the performance of prediction models with either a binary or a time-to-event outcome. We address how to deal with incomplete availability of study-specific results (performance estimates and their precision), and how to produce summary estimates of the c-statistic, the observed:expected ratio and the calibration slope. Furthermore, we discuss the implementation of frequentist and Bayesian meta-analysis methods, and propose novel empirically based prior distributions to improve estimation of between-study heterogeneity in small samples. Finally, we illustrate all methods using two examples: meta-analysis of the predictive performance of EuroSCORE II and of the Framingham Risk Score. All examples and meta-analysis models have been implemented in our newly developed R package “metamisc”.
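To make the idea of a summary c-statistic concrete, the following is a minimal sketch of a frequentist random-effects meta-analysis on the logit scale. It is not the metamisc implementation: the DerSimonian-Laird estimator, the normal-approximation confidence interval, the function names and the input values are all assumptions chosen for illustration.

```python
import math


def logit(p):
    """Log-odds transform, used to map a c-statistic in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of the logit transform."""
    return 1.0 / (1.0 + math.exp(-x))


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooling with the DerSimonian-Laird heterogeneity estimator.

    Returns (summary estimate, its standard error, tau^2).
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    if k < 2:
        tau2 = 0.0  # heterogeneity cannot be estimated from a single study
    else:
        q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
        c = sw - sum(wi ** 2 for wi in w) / sw
        tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    mu = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))
    return mu, se_mu, tau2


def summarize_c_statistic(c_values, se_values, z=1.96):
    """Pool study-level c-statistics on the logit scale and back-transform.

    The standard error of logit(c) is approximated with the delta method:
    SE(logit c) ~= SE(c) / (c * (1 - c)).
    """
    y = [logit(c) for c in c_values]
    se = [s / (c * (1.0 - c)) for c, s in zip(c_values, se_values)]
    mu, se_mu, tau2 = dersimonian_laird(y, se)
    return {
        "summary_c": expit(mu),
        "ci_lower": expit(mu - z * se_mu),
        "ci_upper": expit(mu + z * se_mu),
        "tau2": tau2,  # between-study variance on the logit scale
    }


# Hypothetical validation results (illustrative values, not from any study).
example = summarize_c_statistic(
    c_values=[0.75, 0.80, 0.70, 0.78],
    se_values=[0.02, 0.03, 0.025, 0.015],
)
print(example)
```

Pooling on the logit scale keeps the back-transformed summary and its interval inside (0, 1). The DerSimonian-Laird estimator is used here only because it has a closed form; other heterogeneity estimators or a Bayesian model could be substituted in `dersimonian_laird` without changing the rest of the sketch.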
