Semantic relations in concept-based cross-language medical information retrieval

We explore and evaluate the usefulness of semantic annotation, particularly semantic relations, in cross-language information retrieval (CLIR) in the medical domain. As the baseline for automatic semantic annotation we use UMLS, which specifies semantic relations between medical concepts. We developed two methods to improve the accuracy and yield of relations in CLIR: a method for relation filtering and a method for discovering new relation instances. Both techniques were applied to a corpus of English and German medical abstracts and evaluated for their effectiveness in CLIR. Results show that filtering reduces recall without a significant increase in precision, while the discovery of new relation instances improves retrieval.
