Prediction of Performance on Cross-Lingual Information Retrieval by Regression Models
This paper empirically examines factors that affect the performance of cross-lingual information retrieval (CLIR). To obtain experimental data, we submitted to the NTCIR-4 CLIR task the search results of a Japanese monolingual run and three bilingual runs against the Japanese document collection (Chinese-Japanese, Korean-Japanese, and English-Japanese). A regression model whose independent variables are the “quality” of query translation and the inherent “difficulty” of the search explains the variation in average precision across CLIR runs well. Translation “quality” was measured as a score assigned by a human assessor according to the degree to which each translation coincides with the corresponding term in the Japanese topic provided by the task organizers. Search “difficulty” was represented by the average precision of the run that used the Japanese topic (i.e., the monolingual run).
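The kind of model described above can be illustrated with a minimal sketch. It assumes a plain linear form fitted by ordinary least squares, which the abstract does not specify, and the function name `fit_ap_regression`, the variable names, and all numbers are hypothetical, invented for illustration rather than taken from the NTCIR-4 experiments. Each row stands for one topic: a translation quality score, the monolingual average precision used as the difficulty measure, and the CLIR average precision to be explained.

```python
import numpy as np


def fit_ap_regression(quality, mono_ap, clir_ap):
    """Fit clir_ap ~ b0 + b1 * quality + b2 * mono_ap by ordinary least squares.

    Returns the coefficient vector (b0, b1, b2) and the R^2 of the fit.
    The linear form is an assumption made for this sketch.
    """
    quality = np.asarray(quality, dtype=float)
    mono_ap = np.asarray(mono_ap, dtype=float)
    clir_ap = np.asarray(clir_ap, dtype=float)

    # Design matrix: intercept column, translation quality, monolingual AP.
    X = np.column_stack([np.ones_like(quality), quality, mono_ap])
    coef, *_ = np.linalg.lstsq(X, clir_ap, rcond=None)

    # R^2: share of the variance in CLIR average precision explained by the model.
    fitted = X @ coef
    ss_res = np.sum((clir_ap - fitted) ** 2)
    ss_tot = np.sum((clir_ap - clir_ap.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# Toy per-topic data (hypothetical, noise-free, NOT from the paper).
quality = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 3.0])   # assessor score per topic
mono_ap = np.array([0.2, 0.5, 0.1, 0.4, 0.3, 0.6])   # monolingual AP per topic
clir_ap = 0.04 * quality + 0.5 * mono_ap              # constructed response

coef, r2 = fit_ap_regression(quality, mono_ap, clir_ap)
print("intercept, quality, difficulty coefficients:", coef)
print("R^2:", r2)
```

Because the toy response is constructed exactly from the two predictors, the fit only demonstrates the mechanics. With real per-topic data, the R^2 value would indicate how much of the variation in CLIR average precision the two variables account for.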