Automated semantic indexing of figure captions to improve radiology image retrieval.

OBJECTIVE We explored automated concept-based indexing of unstructured figure captions to improve retrieval of images from radiology journals.

DESIGN The MetaMap Transfer program (MMTx) was used to map the text of 84,846 figure captions from 9,004 peer-reviewed, English-language articles to concepts in three controlled vocabularies from the UMLS Metathesaurus, version 2006AA. Sampling procedures were used to estimate the standard information-retrieval metrics of precision and recall, and to evaluate the degree to which concept-based retrieval improved image retrieval.

MEASUREMENTS Precision was estimated from a sample of 250 concepts, and recall from a sample of 40 concepts. We measured how much concept-based retrieval improved upon keyword-based retrieval in a random sample of 10,000 search queries issued by users of a radiology image search engine.

RESULTS Estimated precision was 0.897 (95% confidence interval, 0.857-0.937). Estimated recall was 0.930 (95% confidence interval, 0.838-1.000). In 5,535 of 10,000 search queries (55%), concept-based retrieval found results not identified by simple keyword matching; in 2,086 searches (21%), more than 75% of the results were found by concept-based search alone.

CONCLUSION Concept-based indexing of radiology journal figure captions achieved very high precision and recall, and significantly improved image retrieval.
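The per-query comparison reported in the results can be illustrated with a minimal sketch. It assumes that each query yields one set of image identifiers from keyword matching and one from concept-based search, and that "results" means the union of the two sets; the abstract does not specify either point. The function names and the toy data are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, Set, Tuple


def concept_only_share(keyword_hits: Set[str], concept_hits: Set[str]) -> float:
    """Share of all retrieved images that only concept-based search found.

    Assumption: the denominator is the union of both result sets.
    """
    all_hits = keyword_hits | concept_hits
    if not all_hits:
        return 0.0
    return len(concept_hits - keyword_hits) / len(all_hits)


def summarize(queries: Iterable[Tuple[Set[str], Set[str]]]) -> Dict[str, float]:
    """Fraction of queries with any concept-only result, and with more than 75%."""
    shares = [concept_only_share(kw, cpt) for kw, cpt in queries]
    n = len(shares)
    if n == 0:
        return {"any_new": 0.0, "over_75": 0.0}
    return {
        "any_new": sum(s > 0 for s in shares) / n,
        "over_75": sum(s > 0.75 for s in shares) / n,
    }


# Toy data (hypothetical): (keyword results, concept-based results) per query.
queries = [
    ({"img1", "img2"}, {"img1", "img2"}),                      # nothing new
    ({"img1"}, {"img1", "img3"}),                              # half are concept-only
    (set(), {"img4", "img5"}),                                 # all are concept-only
    ({"img6"}, {"img6", "img7", "img8", "img9", "img10"}),     # four of five
]
summary = summarize(queries)
print(summary)
```

Applied to the reported counts, the same kind of tally would correspond to 5,535 of 10,000 queries (55%) with at least one concept-only result and 2,086 of 10,000 (21%) above the 75% threshold.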
