Inter-observer agreement in audit of quality of radiology requests and reports.

AIMS: To assess the quality of imaging procedure requests and radiologists' reports using an auditing tool, and to assess the agreement between different observers on the quality parameters.

MATERIALS AND METHODS: In an audit using a standardized scoring system, three observers reviewed the request forms for 296 consecutive radiological examinations, and two observers reviewed a random sample of 150 of the corresponding radiologists' reports. We present descriptive statistics from the audit and pairwise inter-observer agreement, using proportion agreement and kappa statistics.

RESULTS: The proportion of acceptable item scores (0 or +1) was above 70% for all items except the requesting physician's bleep or extension number, the legibility of the physician's name, and details about previous investigations. For pairs of observers, inter-observer agreement was generally high; however, the corresponding kappa values were consistently low, with only 14 of 90 ratings >0.60 and 6 >0.80 on the requests/reports. For the quality of the clinical information, the appropriateness of the request, and the requested priority/timing of the investigation, the mean percentage agreement ranged from 67% to 76%, and the corresponding kappa values ranged from 0.08 to 0.24.

CONCLUSION: The inter-observer reliability of scores on the different items showed a high degree of agreement, although the kappa values were low, a well-known paradox. Current routines for requesting radiology examinations appeared satisfactory, although several problem areas were identified.
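The "high agreement but low kappa" pattern noted in the conclusion follows directly from how kappa corrects for chance: when one score category dominates, chance agreement is already high, so kappa collapses even though raw agreement is high. The following is a minimal sketch in Python; the contingency counts are hypothetical illustrations chosen to mimic the reported ranges, not data from this audit.

```python
# Illustrative sketch of the "high agreement, low kappa" paradox.
# The 2x2 table below is hypothetical (not data from this study):
# rows = observer A, columns = observer B,
# categories = "acceptable" vs "not acceptable" item score.

def cohens_kappa(table):
    """Return (observed agreement, Cohen's kappa) for a square contingency table."""
    total = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / total
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [sum(col) / total for col in zip(*table)]
    # Chance-expected agreement from the marginal rating frequencies.
    expected = sum(r * c for r, c in zip(row_marginals, col_marginals))
    return observed, (observed - expected) / (1 - expected)

# Heavily skewed prevalence: most items rated "acceptable" by both observers.
table = [[68, 12],
         [16, 4]]

observed, kappa = cohens_kappa(table)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
# -> observed agreement = 0.72, kappa = 0.05
```

With 72% raw agreement, chance agreement is already about 70% because both observers use the "acceptable" category so often, leaving kappa near zero despite the seemingly reassuring percentage agreement.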
