Inter-observer variability in mammography screening and effect of type and number of readers on screening outcome

We prospectively determined the variability in radiologists' interpretation of screening mammograms and assessed the influence of the type and number of readers on screening outcome. Twenty-one screening mammography radiographers and eight screening radiologists participated. A total of 106 093 screening mammograms were double-read by two radiographers and, in turn, by two radiologists. Initially, radiologists were blinded to the referral opinion of the radiographers. A woman was referred if she was considered positive at radiologist double-reading with consensus interpretation, or if she was referred after radiologist review of cases read as positive at radiographer double-reading. During 2-year follow-up, clinical data, breast imaging reports, biopsy results and breast surgery reports were collected for all women with a positive screening result from any reader. Single radiologist reading (I) resulted in a mean cancer detection rate of 4.64 per 1000 screens (95% confidence interval (CI)=4.23–5.05), with individual rates varying from 3.44 (95% CI=2.30–4.58) to 5.04 (95% CI=3.81–6.27), and a sensitivity of 63.9% (95% CI=60.5–67.3), ranging from 51.5% (95% CI=39.6–63.3) to 75.0% (95% CI=65.3–84.7). Sensitivities for non-blinded radiologist double-reading (II), radiologist double-reading followed by radiologist review of positive cases at radiographer double-reading (III), triple reading by one radiologist and two radiographers with referral of all positive readings (IV) and quadruple reading by two radiologists and two radiographers with referral of all positive readings (V) were as follows: 68.6% (95% CI=65.3–71.9) (II); 73.2% (95% CI=70.1–76.4) (III); 75.2% (95% CI=72.1–78.2) (IV); and 76.9% (95% CI=73.9–79.9) (V). We conclude that screener performance varied significantly at single-reading. Double-reading increased sensitivity by a relative 7.3%. When there is a shortage of screening radiologists, triple reading by one radiologist and two radiographers may replace radiologist double-reading.
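As a rough check on the arithmetic behind these figures, the sketch below recomputes the 95% CI of the mean detection rate and the relative gain in sensitivity from single to double reading. It assumes a normal-approximation (Wald) interval with all 106 093 screens as the denominator; the abstract does not state which interval method was used, and the function names `wald_ci` and `relative_increase` are illustrative only. Under that assumption the interval matches the reported 4.23–5.05 per 1000. The relative gain computed from the rounded sensitivities comes out near 7.4%, close to the reported 7.3%; the small gap presumably reflects rounding of the published percentages.

```python
import math


def wald_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Assumption: the abstract does not name its CI method; Wald is used here
    only because it is the simplest textbook choice.
    """
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def relative_increase(baseline, new):
    """Relative change from baseline to new, as a fraction of baseline."""
    return (new - baseline) / baseline


# Figures taken from the abstract.
screens = 106_093          # total screening mammograms
rate = 4.64 / 1000         # mean detection rate at single radiologist reading (I)
sens_single = 63.9         # sensitivity (%) at single reading (I)
sens_double = 68.6         # sensitivity (%) at radiologist double-reading (II)

low, high = wald_ci(rate, screens)
gain = relative_increase(sens_single, sens_double)

print(f"Detection rate 95% CI: {low * 1000:.2f} to {high * 1000:.2f} per 1000")
print(f"Relative sensitivity gain, double vs single: {gain * 100:.1f}%")
```

The same two helpers can be applied to strategies III to V by substituting their sensitivities; the sensitivity intervals themselves cannot be rechecked this way because the abstract does not report the number of cancers used as the denominator.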
