Standardized Reporting of Clinical Practice Guidelines: A Proposal from the Conference on Guideline Standardization

Evidence-based clinical practice guidelines can reduce the delivery of inappropriate care and support the introduction of new knowledge into clinical practice (1-3). In many cases, guidelines encapsulate the most current knowledge about best practices. Rigorously developed guidelines can translate complicated research findings into actionable recommendations for clinical care (4). Over the past decade, a plethora of guidelines has been created and published by a multitude of organizations at substantial cost. Despite the enormous energies invested in guideline authoring, the quality of individual guidelines varies considerably. In its landmark report, the Institute of Medicine (IOM) defined 8 desirable attributes of clinical practice guidelines: validity, reliability and reproducibility, clinical applicability, clinical flexibility, clarity, documentation, development by a multidisciplinary process, and plans for review (5). However, critical information that would attest to validity or would document fulfillment of the other IOM criteria is regularly absent from published guidelines. In an evaluation of 279 guidelines developed by U.S. medical specialty societies, Shaneyfelt and colleagues (6) found that guidelines published in the peer-reviewed medical literature do not adhere to established methodologic standards. Likewise, Grilli and colleagues (7) found that of 431 guidelines produced by specialty societies, 82% did not apply explicit criteria to grade the scientific evidence that supported their recommendations, 87% did not report whether a systematic literature search was performed, and 67% did not describe the type of professionals involved in guideline development. Systematic reviews of guidelines for drug therapy (8), management of depression (9), and osteoporosis (10) have confirmed marked variation in quality. Both nonadherence to methodologic standards and failure to document development activities contribute to this variation. 
We convened the Conference on Guideline Standardization (COGS) to define a standard for guideline reporting that would promote guideline quality and facilitate implementation. The proposed standard provides a checklist of components necessary for evaluation of validity and usability. The checklist is intended to minimize the quality defects that arise from failure to include essential information and to promote development of recommendation statements that are more easily implemented. In contrast to other instruments that have been developed for post hoc evaluation of guideline quality, the COGS checklist is intended to be used prospectively by developers to improve their product by improving documentation. The COGS panel used a systematic and rigorous process to define content of the proposed standard and to achieve consensus. The COGS panel also included a wide variety of perspectives, deliberately bringing together representatives from medical specialty societies, government agencies, and private groups that develop guidelines; journal editors and the National Guideline Clearinghouse (NGC), which disseminate guidelines; guideline implementers, including managed care representatives and informaticians; and academicians.

Methods

We actively sought people with diverse backgrounds from wide-ranging geographic areas to participate in the meeting. Selection criteria for participants were 1) activity in a wide variety of guidelines initiatives, 2) recognition as leaders in their field, and 3) willingness to collaborate. To maximize interaction, the number of participants in the Conference was limited. We set as a task for the panelists the specification and definition of a set of necessary guideline components that should be considered for reporting in all evidence-based practice guidelines. We defined necessary items as those that establish the validity of the guideline recommendations or facilitate practical application of the recommendations.
We noted that many additional items might be considered appropriate components in guidelines, but sought to define a minimal set of essential elements. We assembled a list of candidate guideline components from the IOM Provisional Instrument for Assessing Clinical Guidelines (5), the NGC (11), and the Guideline Elements Model (12), a hierarchical model of guideline content that was created from a systematic review of published guideline descriptions. These items were supplemented with items highlighted in the literature, for example, structured abstract (13) and conflict of interest (14). We applied a modified Delphi approach to help focus group discussion. This approach has been widely applied for evaluation of expert opinion on appropriateness (15) and medical necessity (16), for policy development (17), and for prioritization (18). The technique has been described in detail (19). In brief, after we secured agreement to participate in the COGS panel, we gave all participants a bibliography of resources regarding guideline quality and its appraisal, guideline implementation, and the modified Delphi consensus development approach. Panelists were asked to rate their agreement with the statement "[Item name] is a necessary component of practice guidelines" on a 9-point scale. Rating an item with a 9 indicated strong agreement with the statement that this item was necessary. Rating an item with a 1 indicated strong disagreement with the statement and suggested that the item was absolutely unnecessary in a guideline. A rating of 5 indicated neutrality or indifference. We developed a password-protected Web site that panelists used to complete the first round of ratings online before the meeting. Online rating permitted accurate and efficient data capture and analysis. For each item, the median rating and the disagreement index (defined as the interpercentile range divided by the interpercentile range adjusted for symmetry) were calculated (19).
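The calculation described above can be sketched in code. This is a minimal illustration based on the RAND/UCLA Appropriateness Method User's Manual cited in the bibliography; the specific constants (a baseline IPRAS of 2.35, an asymmetry correction factor of 1.5) and the use of the 30th and 70th percentiles for the interpercentile range come from that manual, not from the COGS report itself, and should be treated as assumptions here.

```python
import statistics

def disagreement_index(ratings):
    """Interpercentile range (30th-70th percentile) divided by the
    interpercentile range adjusted for symmetry (IPRAS).
    Constants 2.35 and 1.5 follow the RAND/UCLA manual (an assumption,
    not stated in the COGS report)."""
    # Decile cut points; indexes 2 and 6 are the 30th and 70th percentiles.
    qs = statistics.quantiles(sorted(ratings), n=10, method="inclusive")
    p30, p70 = qs[2], qs[6]
    ipr = p70 - p30
    # Asymmetry index: distance of the IPR's central point from the
    # scale midpoint (5 on the 9-point scale).
    asymmetry = abs(5.0 - (p30 + p70) / 2)
    ipras = 2.35 + asymmetry * 1.5
    return ipr / ipras

def retained(ratings):
    """COGS retention rule: median of 7 or higher and a
    disagreement index below 1."""
    return statistics.median(ratings) >= 7 and disagreement_index(ratings) < 1
```

With a panel clustered at the high end of the scale (for example, ratings of 7 to 9), the index falls well below 1 and the item is retained; a polarized panel split between 1s and 9s yields an index above 1, indicating disagreement.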
The disagreement index, which can be calculated for panels of any size, describes the dispersion of ratings more effectively than the mean absolute deviation from the median. Index values greater than 1 indicate disagreement. We displayed summary statistics for each item on a form that was individualized for each panelist. The COGS meeting was held on 26 and 27 April 2002 in New Haven, Connecticut. In a colloquy facilitated by 2 of the authors, each candidate item was discussed to ensure that all participants agreed on its definition and potential contribution to the COGS checklist and to highlight empirical evidence of its value. When appropriate, additional items were added to the list. The group determined that in the second round of ratings, it would be valuable to rate each item's necessity on 2 subscales: necessity to establish validity and necessity for practical application. The participants then rated each item on these 2 dimensions.

Analysis

We tallied the median score and the disagreement index for each item. We retained items with median scores of 7 or higher and disagreement indexes less than 1.0 on either scale as necessary guideline components on the checklist. To interpret necessity, the 9-point scale is divided into 3 ranges: items scoring 1 to 3 are considered unnecessary, items scoring 4 to 6 are neutral, and items scoring 7 to 9 are considered necessary (20). In this study, the threshold of 7 was chosen because it represented the lowest rating at which participants indicated that an item was necessary for inclusion in guidelines.

Pilot Review

To field test the proposed checklist, we surveyed organizations that were active in guideline development. We identified all organizations that met the NGC Web site criteria for guideline display on 12 July 2002. From that list, we selected organizations that had developed 10 or more guidelines displayed on the NGC Web site.
We excluded 1) organizations that participated in development of the COGS checklist (because of potential bias) and 2) government agencies and organizations based outside the United States (for logistic reasons). A draft COGS checklist and a brief survey were sent to the people identified as being responsible for guideline development at each eligible organization.

Results

All 23 panelists submitted first- and second-round ballots. During the discussion, participants suggested consideration of 10 new items in the second round of balloting and refined definitions of several items. Thirty-six discrete items were considered necessary to establish guideline validity; they received ratings of 7 or greater and had disagreement indexes of less than 1. Twenty-four items were considered necessary for practical application of the guideline, each with a disagreement index less than 1. Several items were rated necessary on both dimensions. Overall, 44 discrete items were considered necessary. Closely related items were then consolidated into 18 topics to create the COGS checklist for reporting clinical guidelines (Table). Appendix Tables 1 and 2 present a complete listing of all items rated and their scores.

Table. The COGS Checklist for Reporting Clinical Practice Guidelines

Appendix Table 1. Items Ranked Necessary for Guideline Validity, Median Ratings, Distribution of Ratings by Tertiles, and Disagreement Index

Appendix Table 2. Items Ranked Necessary for Guideline Usability, Median Ratings, Distribution of Ratings by Tertiles, and Disagreement Index

Twenty-two organizations met eligibility criteria for evaluation of the draft checklist, and all completed the survey (100% response rate). Sixteen organizations (73%) responded that they believed the checklist would be helpful for creating more comprehensive practice guidelines, and an additional 2 organizations (9%) responded that it might be helpful.
Nineteen respondents (86%) indicated that documenting the proposed items would fit within their organizations' guideline development methods. Fifteen (68%) stated that they would use the proposed checklist in guideline development.

[1] F. McAlister et al. What is the quality of drug therapy clinical practice guidelines in Canada? CMAJ. 2001.

[2] D. Bergman et al. Clinical practice guidelines in pediatric and newborn medicine: implications for their use in practice. Pediatrics. 1997.

[3] J. Grimshaw et al. Effect of clinical guidelines on medical practice: a systematic review of rigorous evaluations. The Lancet. 1993.

[4] A. Detsky et al. Relationships between authors of clinical practice guidelines and the pharmaceutical industry. JAMA. 2002.

[5] R. Brook et al. Appropriateness of the use of cardiovascular procedures: a method and results of this application. Schweizerische medizinische Wochenschrift. 1993.

[6] M. Mayo-Smith et al. Are guidelines following guidelines? The methodological quality of clinical practice guidelines in the peer-reviewed medical literature. JAMA. 1999.

[7] J. M. Overhage et al. Computerizing guidelines to improve care and patient outcomes: the example of heart failure. Journal of the American Medical Informatics Association. 1995.

[8] J. Grimshaw et al. Potential benefits, limitations, and harms of clinical guidelines. BMJ. 1999.

[9] A. Haines et al. Implementing findings of research. BMJ. 1994.

[10] I. Graham et al. Systematic assessment of the quality of osteoporosis guidelines. BMC Musculoskeletal Disorders. 2002.

[11] K. Kahn et al. Physician ratings of appropriate indications for six medical and surgical procedures. American Journal of Public Health. 1986.

[12] L. Leape et al. Measuring the necessity of medical procedures. Medical Care. 1994.

[13] J. P. Kahan et al. Coronary Artery Bypass Graft: A Literature Review and Ratings of Appropriateness and Necessity. 1991.

[14] I. D. Graham et al. A comparison of clinical practice guideline appraisal instruments. International Journal of Technology Assessment in Health Care. 2000.

[15] D. Moher et al. The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Annals of Internal Medicine. 2001.

[16] M. Field et al. Guidelines for Clinical Practice: From Development to Use. 1992.

[17] J. Grimshaw et al. Development and application of a generic methodology to assess the quality of clinical guidelines. International Journal for Quality in Health Care. 1999.

[18] P. Littlejohns et al. Appraising clinical practice guidelines in England and Wales: the development of a methodologic framework and its application to policy. Joint Commission Journal on Quality Improvement. 1999.

[19] D. Moher et al. Use of the CONSORT statement and quality of reports of randomized trials: a comparative before-and-after evaluation. JAMA. 2001.

[20] D. Moher et al. The CONSORT statement: revised recommendations for improving the quality of reports of parallel-group randomized trials. Annals of Internal Medicine. 2001.

[21] H. Rubin et al. More informative abstracts of articles describing clinical practice guidelines. Annals of Internal Medicine. 1993.

[22] B. Burnand et al. The RAND/UCLA Appropriateness Method User's Manual. 2001.

[23] R. N. Shiffman et al. GEM: a proposal for a more comprehensive guideline document model using XML. Journal of the American Medical Informatics Association. 2000.

[24] A. Liberati et al. Practice guidelines developed by specialty societies: the need for a critical appraisal. The Lancet. 2000.

[25] M. Egger et al. Value of flow diagrams in reports of randomized controlled trials. JAMA. 2001.

[26] J. Grimshaw et al. The quantity and quality of clinical practice guidelines for the management of depression in primary care in the UK. British Journal of General Practice. 1999.

[27] J. Lavis et al. Appropriateness in health care delivery: definitions, measurement and policy implications. CMAJ. 1996.