Extending an evidence hierarchy to include topics other than treatment: revising the Australian 'levels of evidence'

Background: In 1999 a four-level hierarchy of evidence was promoted by the National Health and Medical Research Council in Australia. The primary purpose of this hierarchy was to assist with clinical practice guideline development, although it was co-opted for use in systematic literature reviews and health technology assessments. In this hierarchy interventional study designs were ranked according to the likelihood that bias had been eliminated, and thus it was not ideal for assessing studies that addressed other types of clinical questions. This paper reports on the revision and extension of this evidence hierarchy to enable broader use within existing evidence assessment systems.

Methods: A working party identified and assessed empirical evidence, and used a commissioned review of existing evidence assessment schema, to support decision-making regarding revision of the hierarchy. The aim was to retain the existing evidence levels I-IV but increase their relevance for assessing the quality of individual diagnostic accuracy, prognostic, aetiologic and screening studies. Comprehensive public consultation was undertaken and the revised hierarchy was piloted by individual health technology assessment agencies and clinical practice guideline developers. After two and a half years, the hierarchy was again revised and entered a further 18-month pilot period.

Results: A suitable framework was identified upon which to model the revision. Consistency was maintained in the hierarchy of "levels of evidence" across all types of clinical questions; empirical evidence was used to support the relationship between study design and ranking in the hierarchy wherever possible; and systematic reviews of lower level studies were themselves ascribed a ranking. The impact of ethics on the hierarchy of study designs was acknowledged in the framework, along with a consideration of how harms should be assessed.

Conclusion: The revised evidence hierarchy is now widely used and provides a common standard against which to initially judge the likelihood of bias in individual studies evaluating interventional, diagnostic accuracy, prognostic, aetiologic or screening topics. Detailed quality appraisal of these individual studies, as well as grading of the body of evidence to answer each clinical, research or policy question, can then be undertaken as required.
