Cross-language information retrieval with the UMLS metathesaurus

We investigate an automatic method for Cross-Language Information Retrieval (CLIR) that uses the multilingual UMLS Metathesaurus to translate Spanish and French natural language queries into English. Two experiments are presented using OHSUMED, a subset of MEDLINE. Both experiments examine the retrieval effectiveness of the translated queries; in the second experiment, the query translation procedure is augmented with digram-based vocabulary normalization procedures. Retrieval effectiveness is compared using three measures: the 11-point average precision score (11-AvgP); average interpolated precision at a recall of 0.1; and noninterpolated (i.e., exact) precision after 10 retrieved documents. Our results indicate that for Spanish the UMLS Metathesaurus-based CLIR method appears equivalent to the multilingual dictionary-based approaches investigated in the current literature. French yields less favorable results, and our analysis suggests that linguistic differences may have caused the performance differences.
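The three measures named above can be illustrated for a single query with a minimal Python sketch. It follows the common textbook definitions (interpolated precision at recall level r taken as the highest precision at any rank whose recall is at least r), which may differ in detail from the evaluation software used in the study. The function names and the toy ranking are hypothetical and are not taken from the paper; the averages reported in the study would be the means of such per-query values over all queries.

```python
from typing import Sequence, Set


def precision_at_k(ranking: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Noninterpolated (exact) precision after k retrieved documents.

    Divides by k even if fewer than k documents were retrieved.
    """
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def interpolated_precision(ranking: Sequence[str], relevant: Set[str],
                           recall_level: float) -> float:
    """Highest precision at any rank whose recall is >= recall_level."""
    if not relevant:
        return 0.0
    hits = 0
    best = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            # Precision peaks only at ranks holding a relevant document,
            # so checking these ranks is sufficient.
            if hits / len(relevant) >= recall_level:
                best = max(best, hits / rank)
    return best


def eleven_point_average(ranking: Sequence[str], relevant: Set[str]) -> float:
    """Mean interpolated precision at recall 0.0, 0.1, ..., 1.0."""
    levels = [i / 10 for i in range(11)]
    return sum(interpolated_precision(ranking, relevant, r) for r in levels) / 11


# Toy data (hypothetical): ten ranked document ids, three of them relevant.
ranking = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10"]
relevant = {"d1", "d3", "d6"}

print(eleven_point_average(ranking, relevant))
print(interpolated_precision(ranking, relevant, 0.1))
print(precision_at_k(ranking, relevant, 10))
```

The measure at recall 0.1 and the precision after 10 documents emphasize the top of the ranking, while the 11-point average summarizes precision across the whole recall range.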
