Construction of a Fuzzy Multilingual Thesaurus and Its Application to Cross-Lingual Text Retrieval

Cross-lingual text retrieval (CLTR) is fundamentally a problem of vocabulary mismatch, because queries and documents are expressed in different languages. A multilingual thesaurus is commonly used to enable term matching across languages. However, a multilingual thesaurus that encodes only exact translation equivalents is insufficient for effective CLTR, since relevant documents are often indexed by cross-lingually related terms rather than by exact translations. This paper proposes a novel approach, based on fuzzy set theory, for automatically constructing a multilingual thesaurus. By representing the degree of relatedness between terms in different languages as a membership degree, the thesaurus supports partial matching of cross-lingually related terms.
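
To make the idea of partial matching concrete, the following is a minimal sketch, not the construction method proposed in this paper. It assumes a fuzzy thesaurus stored as a mapping from each source-language term to a fuzzy set of target-language terms. The English-French language pair, the membership degrees, the `match_score` function, and the max-then-average scoring rule are all hypothetical choices made for illustration.

```python
from typing import Dict, List

# Hypothetical fuzzy thesaurus: each source-language term maps to a fuzzy set
# of target-language terms. Membership degrees in [0, 1] are invented for
# illustration; 1.0 stands for an exact translation equivalent.
FuzzyThesaurus = Dict[str, Dict[str, float]]

thesaurus: FuzzyThesaurus = {
    "bank": {"banque": 1.0, "finance": 0.6, "credit": 0.4},
}


def match_score(query_terms: List[str],
                doc_terms: List[str],
                thesaurus: FuzzyThesaurus) -> float:
    """Score a target-language document against a source-language query.

    For each query term, take the highest membership degree among the
    related terms that index the document, then average over the query
    terms. This aggregation is an assumption, not the paper's formula.
    """
    if not query_terms:
        return 0.0
    doc_set = set(doc_terms)
    total = 0.0
    for term in query_terms:
        related = thesaurus.get(term, {})
        # Best partial match for this query term; 0.0 if nothing matches.
        total += max(
            (degree for target, degree in related.items() if target in doc_set),
            default=0.0,
        )
    return total / len(query_terms)


# A document indexed by the exact translation matches fully.
exact = match_score(["bank"], ["banque"], thesaurus)
# A document indexed only by a related term still matches, but partially.
partial = match_score(["bank"], ["finance", "marche"], thesaurus)
# A document with no related index terms does not match.
unrelated = match_score(["bank"], ["chien"], thesaurus)
```

With a thesaurus restricted to exact translation equivalents, the second document would receive a score of zero. Under the fuzzy representation it is retrieved with a reduced score that reflects its degree of relatedness.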