Paper information - Automatically-Extracted Thesauri for Cross-Language IR: When Better is Worse

A statistical algorithm for extracting bilingual term dictionaries (thesauri) from parallel text is presented, along with refinements for improving their size and accuracy. Somewhat paradoxically, increasing the accuracy of the extracted thesaurus can in fact reduce the performance of an IR system using it to perform query translation for cross-language information retrieval.

Ralf D. Brown

[1] Gerald Salton et al. Automatic Text Processing, 1988.