Limited usefulness of observer-based cosmesis scales employed to evaluate patients treated conservatively for breast cancer.

We evaluated the relative usefulness of two observer-based scales commonly employed to assess the cosmetic outcome of patients treated by breast-preserving techniques for breast cancer. We asked 44 volunteer observers to use one or the other scale to assess cosmetic outcome in a series of 14 projected color photographs of frontal views of treated patients. Our results demonstrate that observer consensus is rarely attained with either scale, particularly for patients with T1 or T2 tumors. Experienced observers could reach a consensus more often, although still infrequently. Moreover, the reliability of both scales is poor: approximately one-third of observers who evaluated the same photograph twice during a single test session changed their answer. We conclude that, while observer-based cosmesis scales demonstrate that current surgical and radiation therapy techniques can provide a "good" cosmetic result in 66-90% of patients with Stage I or II breast cancer, they lack the sensitivity and reliability needed to evaluate factors affecting cosmetic outcome, because all forms of cosmetic change are lumped together into a single assessment. Each type of cosmetic change should be evaluated separately by objective measures to determine the factors related to its development.