Meta-analysis of the technical performance of an imaging procedure: Guidelines and statistical methodology

Medical imaging serves many roles in patient care and the drug approval process, including assessing treatment response and guiding treatment decisions. These roles often involve a quantitative imaging biomarker, an objectively measured characteristic of the underlying anatomic structure or biochemical process derived from medical images. Before a quantitative imaging biomarker is accepted for use in such roles, the imaging procedure used to acquire it must undergo evaluation of its technical performance, which entails assessing performance metrics such as the repeatability and reproducibility of the biomarker. Ideally, this evaluation will involve quantitative summaries of results from multiple studies to overcome limitations due to the typically small sample sizes of technical performance studies and/or to include a broader range of clinical settings and patient populations. This paper reviews meta-analysis procedures for such an evaluation, including identification of suitable studies, statistical methodology to evaluate and summarize the performance metrics, and complete and transparent reporting of the results. The review addresses challenges typical of meta-analyses of technical performance, particularly small study sizes, which often cause violations of the assumptions underlying standard meta-analysis techniques. Alternative approaches that address these difficulties are also presented; simulation studies indicate that they outperform standard techniques when some studies are small. For illustration, the meta-analysis procedures presented are also applied to actual [18F]-fluorodeoxyglucose positron emission tomography (FDG-PET) test–retest repeatability data.
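
As a point of reference for the "standard meta-analysis techniques" mentioned above, the following is a minimal sketch of DerSimonian-Laird random-effects pooling, one commonly used standard approach. The abstract does not state which estimators the paper uses, so the choice of method, the function name `dersimonian_laird`, and the input values are assumptions made for illustration only; the numbers are made up and are not FDG-PET data.

```python
import math


def dersimonian_laird(estimates, variances):
    """Pool study-level estimates with DerSimonian-Laird random effects.

    estimates: one performance-metric estimate per study (hypothetical input).
    variances: the within-study variance of each estimate, treated as known.
    Returns (pooled estimate, standard error, tau^2, Cochran's Q).
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))

    # Method-of-moments estimate of between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2, q


# Made-up study-level estimates and variances, for illustration only.
pooled, se, tau2, q = dersimonian_laird([0.10, 0.14, 0.08], [0.0004, 0.0009, 0.0025])
print(pooled, se, tau2, q)
```

This approach treats the within-study variances as known. When a study has few subjects, its variance is itself estimated imprecisely, which is one way the small study sizes described above can undermine the assumptions of a standard analysis.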
