Random‐effects meta‐analysis of the clinical utility of tests and prediction models

Using data from multiple studies or centers to validate a clinical test or a multivariable prediction model allows researchers to investigate the performance of the test or model in multiple settings and populations. Recently, meta-analytic techniques have been proposed to summarize discrimination and calibration across study populations. Here, we instead consider performance in terms of net benefit, a measure of clinical utility that weighs the benefits of true positive classifications against the harms of false positives. We posit that it is important to examine clinical utility across multiple settings of interest. This requires a suitable meta-analysis method, and we propose a Bayesian trivariate random-effects meta-analysis of sensitivity, specificity, and prevalence. Across a range of chosen harm-to-benefit ratios, this provides a summary measure of net benefit, a prediction interval, and an estimate of the probability that the test/model is clinically useful in a new setting. In addition, the prediction interval and the probability of usefulness can be calculated conditional on the known prevalence in a new setting. The proposed methods are illustrated by two case studies: a meta-analysis of published studies on ear thermometry to diagnose fever in children, and the validation of a multivariable clinical risk prediction model for the diagnosis of ovarian cancer in a multicenter dataset. Crucially, in both case studies the clinical utility of the test/model was heterogeneous across settings, which limits its usefulness in practice. This emphasizes that heterogeneity in clinical utility should be assessed before a test/model is routinely implemented.
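
To make the link between the three meta-analyzed quantities and net benefit concrete, the following minimal Python sketch uses the standard decision-curve definition of net benefit for a binary classification: sensitivity times prevalence, minus the harm-to-benefit ratio times the false positive rate times one minus prevalence. The function names, the comparison against the "treat all" and "treat none" defaults as the criterion for usefulness, and the idea of passing in draws of (sensitivity, specificity, prevalence) for a new setting are assumptions made for illustration; the abstract does not spell out these details, and this is not the authors' implementation.

```python
def net_benefit(sens, spec, prev, w):
    """Net benefit of a test/model at harm-to-benefit ratio w.

    True positives per patient (sens * prev) minus false positives
    per patient ((1 - spec) * (1 - prev)) weighted by w.
    """
    return sens * prev - w * (1.0 - spec) * (1.0 - prev)


def net_benefit_treat_all(prev, w):
    """Net benefit of classifying everyone as positive (sens = 1, spec = 0)."""
    return prev - w * (1.0 - prev)


def prob_useful(draws, w):
    """Share of draws in which the test/model beats both default strategies.

    `draws` is a hypothetical iterable of (sens, spec, prev) tuples, for
    example predictive draws for a new setting from a fitted model.
    "Treat none" has net benefit 0 by definition.
    """
    draws = list(draws)
    useful = 0
    for sens, spec, prev in draws:
        nb = net_benefit(sens, spec, prev, w)
        if nb > max(0.0, net_benefit_treat_all(prev, w)):
            useful += 1
    return useful / len(draws)


if __name__ == "__main__":
    # Illustrative placeholder values only, not results from the paper.
    example_draws = [(0.8, 0.9, 0.2), (0.3, 0.5, 0.2)]
    print(net_benefit(0.8, 0.9, 0.2, w=0.25))
    print(prob_useful(example_draws, w=0.25))
```

Because prevalence enters the formula directly, the same sensitivity and specificity can give a useful test in one setting and a harmful one in another, which is why the calculation can also be made conditional on a known prevalence.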
