Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial

In this article, we present an overview and tutorial of statistical methods for meta-analysis of diagnostic tests under two scenarios: (1) when the reference test can be considered a gold standard and (2) when it cannot. For the first scenario, we first review the conventional summary receiver operating characteristic (SROC) approach and a bivariate approach using linear mixed models. Both approaches require direct calculation of study-specific sensitivities and specificities. We next discuss the hierarchical summary receiver operating characteristic (HSROC) curve approach for jointly modeling positivity criteria and accuracy parameters, and bivariate generalized linear mixed models (GLMMs) for jointly modeling sensitivities and specificities. We further discuss trivariate GLMMs for jointly modeling prevalence, sensitivities, and specificities, which allow us to assess the correlations among the three parameters. The HSROC and GLMM approaches are based on the exact binomial distribution and thus do not require an ad hoc continuity correction. Lastly, for the second scenario, we discuss a latent class random effects model for meta-analysis of diagnostic tests when the reference test itself is imperfect. A number of case studies, with detailed annotated SAS code using the MIXED and NLMIXED procedures, are presented to facilitate implementation of these approaches.
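
To make the "direct calculation" step concrete, the following minimal Python sketch (not the article's SAS code) computes the study-specific logit sensitivity and specificity that a linear mixed model approach would take as input. The function name `logit_accuracy`, the default correction of 0.5, and the rule of correcting only studies with a zero cell are illustrative assumptions; the 2×2 counts in the usage note are hypothetical.

```python
import math


def logit_accuracy(tp, fn, fp, tn, correction=0.5):
    """Logit sensitivity/specificity and approximate variances for one study.

    tp, fn, fp, tn are the cells of a 2x2 table against a gold standard.
    ASSUMPTION: the ad hoc continuity correction is added to every cell,
    but only when at least one cell is zero (one common convention).
    """
    if min(tp, fn, fp, tn) == 0:
        tp, fn, fp, tn = (x + correction for x in (tp, fn, fp, tn))

    sens = tp / (tp + fn)  # proportion of diseased subjects testing positive
    spec = tn / (tn + fp)  # proportion of non-diseased subjects testing negative

    def logit(p):
        return math.log(p / (1.0 - p))

    return {
        "sens": sens,
        "spec": spec,
        "logit_sens": logit(sens),
        "logit_spec": logit(spec),
        # Large-sample (delta-method) variances of the logit proportions
        "var_logit_sens": 1.0 / tp + 1.0 / fn,
        "var_logit_spec": 1.0 / tn + 1.0 / fp,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only
    print(logit_accuracy(tp=90, fn=10, fp=20, tn=80))
    # A study with a zero cell: without the correction the logit is undefined
    print(logit_accuracy(tp=10, fn=0, fp=5, tn=85))
```

Without the correction, the second study would have an observed sensitivity of 1 and an undefined logit. Models that work with the binomial likelihood of the counts directly avoid this step altogether.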
