Query expansion with a medical ontology to improve a multimodal information retrieval system

Searching for biomedical information in a large collection of medical data is a complex task. Biomedical tools and resources can ease the retrieval of the desired information. In this paper, we use the medical ontology MeSH (Medical Subject Headings) to improve a multimodal information retrieval system by expanding the user's query with medical terms. For our experiments, we used the dataset provided by the ImageCLEFmed task organizers for 2005 and 2006. This dataset comprises a multimodal collection (images and text) of clinical cases, a list of queries for each year, and a list of relevance judgments for each query, which are used to evaluate the results. The experiments show that expanding the queries with a medical ontology greatly improves retrieval results.
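
The general idea of ontology-based query expansion can be illustrated with a minimal sketch. It is not the system described in this paper: the thesaurus below is a hypothetical toy dictionary standing in for MeSH, and the function name `expand_query` and the simple phrase-matching strategy are assumptions made only for illustration. When a query contains a heading or one of its entry terms, the remaining terms of that record are appended to the query.

```python
import re

# Hypothetical miniature thesaurus standing in for MeSH (not real MeSH data):
# each key is a heading, each value lists entry terms treated as synonyms.
TOY_THESAURUS = {
    "myocardial infarction": ["heart attack", "cardiac infarction"],
    "fractures, bone": ["broken bone", "bone fracture"],
}


def contains_phrase(text, phrase):
    """Return True if `phrase` occurs in `text` on word boundaries."""
    return re.search(r"\b" + re.escape(phrase) + r"\b", text) is not None


def expand_query(query, thesaurus):
    """Append the heading and entry terms of every record the query matches."""
    # Lowercase and collapse whitespace so matching is case-insensitive.
    normalized = re.sub(r"\s+", " ", query.lower()).strip()
    additions = []
    for heading, entry_terms in thesaurus.items():
        variants = [heading] + entry_terms
        # A record matches if any of its variants appears in the query.
        if any(contains_phrase(normalized, v) for v in variants):
            for v in variants:
                # Add only terms that are not already present.
                if not contains_phrase(normalized, v) and v not in additions:
                    additions.append(v)
    return " ".join([normalized] + additions)


if __name__ == "__main__":
    print(expand_query("X-ray of a heart attack", TOY_THESAURUS))
```

The expanded query would then be passed to the text retrieval component in place of the original. A real system would also need to decide how many terms to add and how to weight them, which this sketch leaves out.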
