Exploiting a Chinese-English bilingual wordlist for English-Chinese cross language information retrieval

We investigated the use of the LDC English/Chinese bilingual wordlists for English-Chinese cross language retrieval. We show that the Chinese-to-English wordlist can serve as both a phrase dictionary and a word dictionary, and that it is preferable to the English-to-Chinese version for phrase translation and for selecting among word translations. Additional techniques, such as term selection and weighting based on frequency in the target corpus, were also employed. Experiments show that over 70% of monolingual effectiveness is achievable for the TREC Chinese corpus and retrieval environment using short queries of a few English words.
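As a rough illustration of the general idea, the sketch below inverts a toy Chinese-to-English wordlist so that it can be looked up by English words and phrases, matches the longest English phrase first, and then keeps and weights the candidate Chinese terms by their frequency in the target corpus. It is a minimal sketch under stated assumptions, not the procedure used in the experiments: the function names, the `top_k` and `max_phrase_len` parameters, the proportional weighting rule, and all data are hypothetical.

```python
from collections import defaultdict


def invert_wordlist(zh_to_en):
    """Build an English -> Chinese lookup from Chinese -> English entries.

    Multi-word English glosses are kept whole, so the inverted list
    works as a phrase dictionary as well as a word dictionary.
    """
    en_to_zh = defaultdict(set)
    for zh_term, glosses in zh_to_en.items():
        for gloss in glosses:
            en_to_zh[gloss.lower()].add(zh_term)
    return en_to_zh


def translate_query(query, en_to_zh, corpus_freq, top_k=2, max_phrase_len=3):
    """Translate an English query into weighted Chinese terms.

    Assumed rule (hypothetical): prefer the longest phrase match, keep the
    top_k candidates by target corpus frequency, and weight each kept
    candidate by its share of the kept candidates' total frequency.
    """
    words = query.lower().split()
    translated = []
    i = 0
    while i < len(words):
        matched = False
        # Try the longest phrase starting at position i first.
        for n in range(min(max_phrase_len, len(words) - i), 0, -1):
            key = " ".join(words[i:i + n])
            candidates = en_to_zh.get(key)
            if not candidates:
                continue
            # Rank by descending corpus frequency; break ties by term.
            ranked = sorted(
                candidates, key=lambda t: (-corpus_freq.get(t, 0), t)
            )[:top_k]
            total = sum(corpus_freq.get(t, 0) for t in ranked)
            for term in ranked:
                if total:
                    weight = corpus_freq.get(term, 0) / total
                else:
                    weight = 1 / len(ranked)
                translated.append((term, weight))
            i += n
            matched = True
            break
        if not matched:
            i += 1  # No entry for this word: skip it.
    return translated


# Toy data, invented for illustration only.
zh_to_en = {
    "信息检索": ["information retrieval"],
    "信息": ["information", "message"],
    "消息": ["message", "news"],
    "检索": ["retrieval"],
}
corpus_freq = {"信息检索": 5, "信息": 30, "消息": 10, "检索": 8}

en_to_zh = invert_wordlist(zh_to_en)
result = translate_query("information retrieval message", en_to_zh, corpus_freq)
```

In this toy run the two-word phrase "information retrieval" is matched as a unit rather than word by word, while the ambiguous word "message" is resolved to its two candidates with weights proportional to their corpus frequencies.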