A Scalable Reference Standard of Visual Similarity for a Content-Based Image Retrieval System

Developing content-based image retrieval (CBIR) systems requires a robust reference standard of similarity between pairs of images, but such a standard is challenging to create because the number of pair-wise comparisons grows quadratically with the number of images. We demonstrated a novel method of creating one for liver tumors seen in 19 portal venous CT scans by computing image similarity from subjective ratings of attributes on single images. Three radiologists rated liver lesions on 6- and 9-point scales, with the lesions displayed individually (P1: ratings for 6 visual attributes) and in all 171 pair-wise combinations (P2: ratings for dissimilarity in each of the 6 attributes and for overall dissimilarity). We averaged the readers' ratings and fit the absolute attribute rating differences from P1 to the ratings in P2. The R-squared value was 0.65 between the pair-wise attribute dissimilarities and overall pair-wise dissimilarity, and 0.46 between a linear combination of the absolute differences of ratings for each attribute and overall pair-wise dissimilarity. For overall dissimilarity, pairs of readers agreed to within 2 points in 64-84% of all ratings. Hence, this scalable method is feasible for creating a reference standard for CBIR.
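The pipeline summarized above (reader-averaged single-image ratings, absolute attribute differences for every lesion pair, a linear fit to overall dissimilarity, and a within-tolerance agreement rate) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of ordinary least squares with an intercept, the 1-to-6 rating range, and all synthetic data are assumptions made only for illustration.

```python
from itertools import combinations

import numpy as np


def pairwise_abs_differences(ratings):
    """Absolute attribute differences for every unordered pair of lesions.

    ratings: array of shape (n_lesions, n_attributes), already averaged
    over readers. Returns (pairs, diffs), diffs of shape (n_pairs, n_attributes).
    """
    ratings = np.asarray(ratings, dtype=float)
    pairs = list(combinations(range(ratings.shape[0]), 2))
    i, j = np.array(pairs).T
    return pairs, np.abs(ratings[i] - ratings[j])


def fit_linear(X, y):
    """Ordinary least squares with an intercept (an assumed model choice).

    Returns (coefficients, r_squared); coefficients[0] is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return coef, 1.0 - ss_res / ss_tot


def agreement_within(a, b, tol=2):
    """Fraction of ratings on which two readers differ by at most tol points."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean(np.abs(a - b) <= tol))


# Synthetic stand-in data (hypothetical, not from the study):
# 3 readers x 19 lesions x 6 attributes, rated 1..6, averaged over readers.
rng = np.random.default_rng(0)
single = rng.integers(1, 7, size=(3, 19, 6)).mean(axis=0)

pairs, diffs = pairwise_abs_differences(single)

# Hypothetical overall pair-wise dissimilarity: a noisy weighted sum of the
# attribute differences, standing in for the averaged P2 ratings.
overall = diffs @ np.full(6, 0.5) + rng.normal(0.0, 0.5, size=len(pairs))

coef, r2 = fit_linear(diffs, overall)
print(len(pairs), diffs.shape, round(r2, 2))
```

With 19 lesions the sketch produces 19 × 18 / 2 = 171 pairs, matching the number of pair-wise combinations rated in P2, while only 19 single-image rating sessions are needed per reader; this is the sense in which the approach scales.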
