Evaluation of the reliability, usability, and applicability of AMSTAR, AMSTAR 2, and ROBIS: protocol for a descriptive analytic study

Background: Systematic reviews (SRs) of randomised controlled trials (RCTs) can provide the best evidence to inform decision-making, but their methodological and reporting quality varies. Tools exist to guide the critical appraisal of quality and risk of bias in SRs, but evaluations of their measurement properties are limited. We will investigate the interrater reliability (IRR), usability, and applicability of A MeaSurement Tool to Assess systematic Reviews (AMSTAR), AMSTAR 2, and Risk Of Bias In Systematic reviews (ROBIS) for SRs in the fields of biomedicine and public health.

Methods: An international team of researchers at three collaborating centres will undertake the study. We will use a random sample of 30 SRs of RCTs investigating therapeutic interventions indexed in MEDLINE in February 2014. Two reviewers at each centre will appraise the quality and risk of bias in each SR using AMSTAR, AMSTAR 2, and ROBIS. We will record the time to complete each assessment and for the two reviewers to reach consensus for each SR. We will extract the descriptive characteristics of each SR, the included studies, participants, interventions, and comparators. We will also extract the direction and strength of the results and conclusions for the primary outcome. We will summarise the descriptive characteristics of the SRs using means and standard deviations, or frequencies and proportions. To test for interrater reliability between reviewers and between the consensus agreements of reviewer pairs, we will use Gwet’s AC1 statistic. For comparability to previous evaluations, we will also calculate weighted Cohen’s kappa and Fleiss’ kappa statistics. To estimate usability, we will calculate the mean time to complete the appraisal and to reach consensus for each tool. To inform applications of the tools, we will test for statistical associations between quality scores and risk of bias judgments, and the results and conclusions of the SRs.

Discussion: Appraising the methodological and reporting quality of SRs is necessary to determine the trustworthiness of their conclusions. Which tool may be most reliably applied and how the appraisals should be used is uncertain; the usability of newly developed tools is unknown. This investigation of common (AMSTAR) and newly developed (AMSTAR 2, ROBIS) tools will provide empiric data to inform their application, interpretation, and refinement.
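As an illustration of the agreement statistics named in the Methods, the following is a minimal sketch of Gwet's AC1 next to Cohen's kappa for two reviewers rating the same items. It is not the protocol's analysis code: the function names and the ratings are hypothetical, the kappa shown is the unweighted form (the protocol specifies weighted kappa), and variance estimates are omitted. The made-up data have 30 items with nearly all ratings in one category, a situation in which kappa can be low despite high raw agreement.

```python
from collections import Counter


def observed_agreement(a, b):
    """Proportion of items on which the two raters give the same rating."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two raters (illustrative helper)."""
    n = len(a)
    categories = set(a) | set(b)
    count_a, count_b = Counter(a), Counter(b)
    p_obs = observed_agreement(a, b)
    # Chance agreement: product of the two raters' marginal proportions.
    p_exp = sum((count_a[k] / n) * (count_b[k] / n) for k in categories)
    return (p_obs - p_exp) / (1 - p_exp)


def gwet_ac1(a, b):
    """Gwet's AC1 for two raters (illustrative helper)."""
    n = len(a)
    categories = set(a) | set(b)
    q = len(categories)
    if q < 2:
        raise ValueError("AC1 needs at least two observed categories")
    count_a, count_b = Counter(a), Counter(b)
    p_obs = observed_agreement(a, b)
    # pi_k: average of the two raters' marginal proportions for category k.
    pi = [(count_a[k] + count_b[k]) / (2 * n) for k in categories]
    # Chance agreement under Gwet's model.
    p_exp = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical ratings of one tool item across 30 SRs:
# 26 joint "yes", 2 where only rater A says "yes", 2 where only rater B does.
rater_a = ["yes"] * 26 + ["yes"] * 2 + ["no"] * 2
rater_b = ["yes"] * 26 + ["no"] * 2 + ["yes"] * 2

agreement = observed_agreement(rater_a, rater_b)
kappa = cohen_kappa(rater_a, rater_b)
ac1 = gwet_ac1(rater_a, rater_b)
print(f"agreement={agreement:.3f} kappa={kappa:.3f} AC1={ac1:.3f}")
```

With these invented ratings the raw agreement is about 0.87, kappa is slightly negative, and AC1 is about 0.85, which shows why a statistic that is less sensitive to skewed category prevalence may be preferred when most items receive the same rating.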
