Automatically-Extracted Thesauri for Cross-Language IR: When Better is Worse
A statistical algorithm for extracting bilingual term dictionaries (thesauri) from parallel text is presented, along with refinements for improving their size and accuracy. Somewhat paradoxically, increasing the accuracy of the extracted thesaurus can in fact reduce the performance of an IR system that uses it to translate queries for cross-language information retrieval.