Exploring access to scientific literature using content-based image retrieval

The number of articles published in the scientific medical literature is continuously increasing, and Web access to journals is becoming common. Databases such as the SPIE Digital Library and IEEE Xplore, indices such as PubMed, and search engines such as Google provide users with sophisticated full-text search capabilities. However, the information contained in the images and graphs within these articles is entirely disregarded. In this paper, we quantify the potential impact of using content-based image retrieval (CBIR) to access this non-text data. Based on the Journal Citation Reports (JCR), the journal Radiology was selected for this study. In 2005, 734 articles were published electronically in this journal. These articles included 2,587 figures, a rate of 3.52 figures per article. Furthermore, 56.4% of these figures are composed of several individual panels, i.e., the figure combines different images and/or graphs. According to the Image Cross-Language Evaluation Forum (ImageCLEF), the error rate of automatic identification of medical images is about 15%. Therefore, by applying ImageCLEF-like techniques, 95.5% of articles could be expected to be retrieved by means of CBIR. The challenge for CBIR in scientific literature, however, is the use of local texture properties to analyze the individual image panels in composite illustrations. When local features are used for content-based image representation, 8.81 images per article are available, and the predicted correctness rate may increase to 98.3%. From this study, we conclude that CBIR may have a high impact on medical literature research and suggest that additional research in this area is warranted.
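The counts quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the study itself: the function and constant names are illustrative, and the total number of images is a derived estimate that assumes the rounded rate of 8.81 images per article applies to all 734 articles. It does not attempt to reproduce the 95.5% and 98.3% retrieval rates, since the abstract does not state how they were computed.

```python
# Values reported in the abstract for the journal Radiology, 2005.
ARTICLES = 734
FIGURES = 2587
COMPOSITE_SHARE = 0.564          # fraction of figures made of several panels
IMAGES_PER_ARTICLE_LOCAL = 8.81  # images per article when panels count individually


def figures_per_article(figures: int, articles: int) -> float:
    """Average number of figures per article."""
    return figures / articles


def composite_figures(figures: int, share: float) -> int:
    """Approximate number of multi-panel figures, rounded to a whole figure."""
    return round(figures * share)


def implied_total_images(articles: int, images_per_article: float) -> int:
    """Rough total image count implied by a (rounded) per-article rate.

    Assumption: the rounded rate is applied uniformly to every article,
    so the result is only an estimate.
    """
    return round(articles * images_per_article)


if __name__ == "__main__":
    print(f"figures per article: {figures_per_article(FIGURES, ARTICLES):.2f}")
    print(f"composite figures:   {composite_figures(FIGURES, COMPOSITE_SHARE)}")
    print(f"implied images:      {implied_total_images(ARTICLES, IMAGES_PER_ARTICLE_LOCAL)}")
```

The first value reproduces the 3.52 figures per article stated in the text; the other two only put the percentages into absolute terms.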
