Improving the accuracy of mammography: volume and outcome relationships.

BACKGROUND: Countries with centralized, high-volume mammography screening programs, such as the U.K. and Sweden, emphasize high specificity (low percentage of false positives) and high sensitivity (high percentage of true positives). By contrast, the United States does not have centralized, high-volume screening programs, emphasizes high sensitivity, and has lower average specificity. We investigated whether high sensitivity can be achieved in the context of high specificity and whether the number of mammograms read per radiologist (reader volume) drives both sensitivity and specificity.

METHODS: The U.K.'s National Health Service Breast Screening Programme uses the PERFORMS 2 test as a teaching and assessment tool for radiologists. The same 60-film PERFORMS 2 test was given to 194 high-volume U.K. radiologists and to 60 U.S. radiologists, who were assigned to low-, medium-, or high-volume groups on the basis of the number of mammograms read per month. The standard binormal receiver-operating characteristic (ROC) model was fitted to the data of individual readers. Detection accuracy was measured by the sensitivity at specificity = 0.90, and differences among sensitivities were determined by analysis of variance.

RESULTS: The average sensitivity at specificity = 0.90 was 0.785 for U.K. radiologists, 0.756 for high-volume U.S. radiologists, 0.702 for medium-volume U.S. radiologists, and 0.648 for low-volume U.S. radiologists. At this specificity, low-volume U.S. radiologists had statistically significantly lower sensitivity than either high-volume U.S. radiologists or U.K. radiologists, and medium-volume U.S. radiologists had statistically significantly lower sensitivity than U.K. radiologists (P < .001 for all comparisons).

CONCLUSIONS: Reader volume is an important determinant of mammogram sensitivity and specificity. High sensitivity (high cancer detection rate) can be achieved with high specificity (low false-positive rate) in high-volume centers. This study suggests that there is great potential for optimizing mammography screening.
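As an illustration of the accuracy measure named in the methods, the following minimal sketch reads sensitivity at a fixed specificity off a binormal ROC curve, using the standard binormal form TPF = Φ(a + b·Φ⁻¹(FPF)). The function name and the parameter values are hypothetical; the abstract does not report fitted parameters for any reader, and this is not the authors' fitting procedure.

```python
from statistics import NormalDist

# Standard normal distribution, used for Phi and its inverse.
_STD_NORMAL = NormalDist()


def binormal_sensitivity(a: float, b: float, specificity: float = 0.90) -> float:
    """Sensitivity (TPF) on a binormal ROC curve at a given specificity.

    a and b are the usual binormal parameters (intercept and slope of the
    ROC curve on normal-deviate axes). They are assumed to have been fitted
    elsewhere; the values used with this function are illustrative only.
    """
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    false_positive_fraction = 1.0 - specificity
    z_fpf = _STD_NORMAL.inv_cdf(false_positive_fraction)
    return _STD_NORMAL.cdf(a + b * z_fpf)


if __name__ == "__main__":
    # Hypothetical reader with a = 2.0, b = 1.0 (not taken from the study).
    print(round(binormal_sensitivity(2.0, 1.0), 3))
```

With b held fixed, a larger a gives a higher sensitivity at the same specificity, which is why a single operating point such as specificity = 0.90 can be used to compare readers.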
