A reference data set for the evaluation of medical image retrieval systems.

Content-based image retrieval is becoming an increasingly important factor in medical imaging research and image management systems. Several retrieval systems and methodologies exist and are used in a wide variety of applications, from automatic labelling of images to diagnostic aid and image classification. Still, it is very hard to compare the performance of these systems, because the databases used often contain copyrighted or private images and thus cannot be exchanged between research groups, partly also to protect patient privacy. Most of the databases currently used for evaluating systems are also fairly small, partly because of the high cost of obtaining the gold standard, or ground truth, that is necessary for evaluation. Several large image databases, though without a gold standard, are starting to become publicly available, for example from the NIH (National Institutes of Health). This article describes the creation of a large medical image database that is used in a teaching file and contains more than 8,700 varied medical images. The images are anonymised and can be exchanged free of charge and free of copyright restrictions. Ground truth (a gold standard) has been obtained for a set of 26 images selected as query topics for content-based query by image example. To reduce the time needed to generate the ground truth, pooling methods well known from the text and information retrieval field have been used. Such a database is a good starting point for comparing current image retrieval systems and measuring their retrieval quality, especially in the context of teaching files, image case databases and the support of teaching. For a comparison of retrieval systems for diagnostic aid, specialised image databases that include the diagnosis and a case description will also need to be made available, together with gold standards for a proper system evaluation.
A first evaluation event for image retrieval is planned for the 2004 CLEF (Cross Language Evaluation Forum) conference to compare text- and content-based access mechanisms to images.
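
The pooling idea mentioned above can be illustrated with a minimal Python sketch. It is not the procedure used for this database: the pool depth, the system names, the image identifiers and the relevance judgements are all hypothetical, and the function names `build_pool` and `precision_at_k` are introduced only for this illustration. The sketch shows the general principle from text retrieval evaluation, in which only the union of the top-ranked results of several systems is shown to the human judges instead of the whole collection.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` images of every system's ranked list.

    Only the images in this pool are shown to the human judges; images
    outside the pool are treated as not relevant.
    """
    pool: Set[str] = set()
    for ranked in runs.values():
        pool.update(ranked[:depth])
    return pool


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the first k retrieved images that were judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for image_id in ranked[:k] if image_id in relevant)
    return hits / k


# Hypothetical ranked results of two systems for one query image.
runs = {
    "system_a": ["img3", "img7", "img1", "img9"],
    "system_b": ["img7", "img2", "img3", "img8"],
}

# Pool of depth 2: the judges only need to look at these images.
pool = build_pool(runs, depth=2)

# Hypothetical judgements made on the pooled images only.
relevant = {"img3", "img2"}

for name, ranked in runs.items():
    print(name, precision_at_k(ranked, relevant, k=2))
```

With a collection of several thousand images, judging only the pooled images for each query topic is what keeps the cost of the ground truth manageable; the trade-off is that relevant images retrieved by none of the pooled systems remain unjudged.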
