Screening Mammography for Younger Women: Back to Basics

This issue of Annals adds fuel to the breast cancer screening debate. It contains the U.S. Preventive Services Task Force's recommendations on breast cancer screening, a summary of the supporting evidence, an editorial about dealing responsibly with conflicting evidence, and a report from the Canadian National Breast Screening Study (CNBSS). The CNBSS randomly assigned 50 304 women age 40 to 49 years to receive annual mammography and breast examinations (up to five screening rounds) or to receive usual care. The article in this issue by Miller and colleagues (1), the third report from this trial, uses data on breast cancer mortality 11 to 16 years after the first screening visit to address the question: Is a woman less likely to die of breast cancer if she starts screening while she is in her forties?

Despite a large study population, careful study design and execution, and long follow-up, the Canadian study still doesn't tell us for certain whether screening helped, harmed, or had no effect. The point estimate of the cumulative rate ratio for death from breast cancer among screened women is 0.97, indicating no effect of screening. The 95% confidence interval (CI) around this ratio sets limits on the potential benefit and harm. The lower bound (0.74) means that screening is unlikely to reduce breast cancer mortality by more than 26%, or about one death per 10 000 women per year. The upper bound (1.27) means that screening is unlikely to increase breast cancer deaths by more than about one death per 10 000 women per year.

I suggest four options for dealing with the CNBSS in the light of all available evidence about breast cancer screening in younger women. We could ignore it and base our conclusions on the other studies. We could rely on it entirely and ignore the other studies. Another editorial in this issue (2) warns us about arbitrarily excluding evidence, so we should have very good reasons for adopting either of these approaches. The remaining choices are to combine the results of the CNBSS with those of the Swedish clinical trials (3) and accept the pooled point estimate of effect as the best estimate, or to admit that we cannot draw a conclusion from this body of evidence.

The Canadian study is a large, high-quality study (4-6). Randomization was done by woman rather than by population; screening occurred more frequently (annually) than in other studies; and the study groups were essentially identical in number and in age distribution, indicating successful randomization. Less than 0.2% of the participants dropped out of the study, and blinded reviewers assigned the cause of death. Furthermore, the follow-up is now long enough to identify an effect of screening if one were truly present (7).

The current report by Miller and colleagues addresses two previous criticisms of the CNBSS (1). First, despite successful randomization, the screened group had a worse prognosis before screening: All participants underwent breast examination before randomization, and 18 women in the screened group but only 5 in the usual care group had breast cancer with four or more positive axillary lymph nodes. Critics said that the adverse prognosis in the screened group contributed to the failure to detect an effect of screening. To test this hypothesis, the authors performed several multivariable analyses that adjusted for baseline differences in prognosis (1). The results of these analyses were the same as those of the main analysis, suggesting that a poor prognosis in screened women did not mask an effect of screening.
Second, critics worried that nearly one quarter of the women in the usual care group had mammography during the study period, potentially biasing the study toward showing no effect. However, when Miller and colleagues included nonstudy mammography in a multivariable model to predict breast cancer death, the results were similar to those of the unadjusted model, suggesting that nonstudy mammography was not masking a screening effect. These new analyses of the CNBSS findings increase confidence that screening simply had no effect.

We can't reasonably ignore the CNBSS or the Swedish trials, so we must compare their results to decide whether they are congruent or conflicting. Many have compared the trials and concluded that they conflict. Yet comparison is more difficult than it first appears, because one must first decide how to count the outcomes in each trial. The principal end point in all of the trials was the number of deaths from breast cancer during the 15 or more years after first screening. The issue is which breast cancer deaths to include. Some deaths are due to breast cancer diagnosed during the active intervention period, when screening rates differed by design. Others are due to cancer diagnosed after the intervention, when the intervention and control groups may have had similar rates of screening.

The follow-up method counts deaths from all breast cancers diagnosed between the first mammography and the last follow-up contact. This method is unbiased and legitimate because it avoids eliminating cancers that could have been detected in the control group or missed in the screening group during the screening period (8). However, deaths from cancers detected after the screening period dilute the effect of screening.

The evaluation method counts only deaths from cancers detected during the intervention period. This method avoids dilution of the screening effect by cancers detected later, but it requires that all control-group participants undergo screening at the same time that the last mammography is performed in the screening group (8). The CNBSS did not perform this screen. The Swedish trials adopted this approach but did not screen the control groups until about a year after they should have, in effect extending the intervention period by 1 year in the control groups and detecting cancers that would not have been detected in the screening group. The Swedish trials reported their results with both the evaluation method and the follow-up method. In light of the Swedish trials' failure to satisfy a key assumption of the evaluation method, the conservative way to report their findings is to use the follow-up method.

The most recent patient-level meta-analysis of four of the Swedish trials appeared in 2002 (3). At a median follow-up of 14.8 years, the pooled relative risk for death from breast cancer detected during the screening and follow-up period was 0.91 (95% CI, 0.76 to 1.09). (By comparison, the evaluation method led to a relative risk for breast cancer death of 0.80 [CI, 0.63 to 1.01].) The CNBSS used only the follow-up method, and the cumulative rate ratio was 0.97 (CI, 0.74 to 1.27). Because the CIs overlap considerably, the Canadian study and the meta-analysis of the Swedish trials are compatible. Furthermore, the difference between them is small, and both point estimates of effect are close to 1.0, indicating little or no effect of screening.

This analysis leads me to several conclusions. First, how you count breast cancer deaths appears to matter, and therefore comparisons should use the same method of counting.
Second, future meta-analyses should combine only results that were obtained with the same method of counting breast cancer deaths. The U.S. Preventive Services Task Force meta-analysis, for example, combined results obtained by using different methods of counting breast cancer deaths (9).

Third, we need to try to reach consensus on which method of counting outcomes best informs the policy question before us. I believe that the relevant policy question is whether to add a policy of screening younger women (age 40 to 49 years) to a policy of screening women 50 years of age and older, about which there is considerable agreement.

Fourth, when the CNBSS and the Swedish trials counted the outcomes of breast cancer screening by using the follow-up method, a legitimate and unbiased approach, the effect of screening women age 40 to 49 years was small.

The debate about the effect of screening younger women goes on, and nothing in this issue of Annals will dampen it. The debate is worth following closely because women are deciding about breast cancer screening (10), and it's our role to keep them informed as best we can.