We compared two methods for obtaining radiologists’ subjective impressions of similarity, for application in distinguishing benign from malignant lesions. Thirty pairs of mammographic clustered calcifications were used in this study. Each of the 30 pairs was first rated for similarity on a 5-point scale, where 1 was nearly identical and 5 was not at all similar (the absolute scoring method). Next, all possible combinations of two pairs (n=435) were shown to the reader, who selected the pair that was more similar (the paired comparison method). Each observer repeated the experiment, with at least a week between reading sessions. Using analysis of variance, intra-class correlation coefficients (ICC) were calculated for both the absolute scoring method and the paired comparison method. In addition, for the paired comparison method, the coefficient of consistency within each reader was calculated. The average coefficient of consistency for the 4 readers was 0.88 (range 0.49-0.97). These results were statistically significantly different from guessing (p << 0.0001). The ICC for intra-reader agreement was 0.51 (95% CI 0.37-0.66) for the absolute method and 0.82 (95% CI 0.73-0.91) for the paired comparison method. This difference was statistically significant (p=0.001). For inter-reader agreement, the ICC was 0.39 (95% CI 0.21-0.57) for the absolute method and 0.37 (95% CI 0.18-0.56) for the paired comparison method. We conclude that humans are able to judge the similarity of clustered calcifications in a meaningful way. Further, radiologists had greater intra-reader agreement when using the paired comparison method than when using an absolute rating scale. Differences in the criteria used by different observers to judge similarity, and differences in interpreting which calcifications comprise the cluster, can lead to low ICC values for inter-reader agreement for both methods.
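As a rough illustration of the paired comparison arithmetic, the sketch below checks that 30 items yield 435 two-item combinations and computes a coefficient of consistency from per-item win counts. It assumes the statistic is Kendall's coefficient of consistency for paired comparisons; the abstract does not name the formula used, and the function names and toy win counts are hypothetical, not taken from the study.

```python
from math import comb


def circular_triads(wins):
    """Number of circular (intransitive) triads, via Kendall's formula.

    `wins[i]` is how many times item i was chosen over the other items
    in a complete round of paired comparisons (hypothetical input layout).
    """
    n = len(wins)
    return n * (n - 1) * (2 * n - 1) / 12 - sum(a * a for a in wins) / 2


def coefficient_of_consistency(wins):
    """Kendall's coefficient: 1 = fully transitive, 0 = maximally circular."""
    n = len(wins)
    # The maximum possible number of circular triads depends on parity of n.
    d_max = (n**3 - n) / 24 if n % 2 else (n**3 - 4 * n) / 24
    return 1 - circular_triads(wins) / d_max


# 30 items compared two at a time give the 435 presentations in the abstract.
n_comparisons = comb(30, 2)

# Toy data: a perfectly transitive reader (win counts 0, 1, ..., 29).
transitive = coefficient_of_consistency(list(range(30)))

# Toy data: three items forming a cycle A > B > C > A (one win each).
cyclic = coefficient_of_consistency([1, 1, 1])
```

Under this assumption, a reader who guesses at random would produce many circular triads and a coefficient well below 1, which is the sense in which the reported values can be compared against guessing.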