Combining Query Translation and Document Translation in Cross-Language Retrieval

This paper describes monolingual, bilingual, and multilingual retrieval experiments using the CLEF 2003 test collection. It compares query translation-based multilingual retrieval with document translation-based multilingual retrieval, in which documents are translated into the query language word by word using machine translation systems or statistical translation lexicons derived from parallel texts. On the CLEF 2003 test collection, document translation-based retrieval performs slightly better than query translation-based retrieval, and combining query translation with document translation performs better still.
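The two ideas in the abstract, word-by-word document translation and the combination of a query-translation run with a document-translation run, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy `LEXICON`, the function names, the choice to keep out-of-vocabulary words unchanged, and in particular the fusion rule (summing min-max normalized scores), which the abstract does not specify and which may differ from the method used in the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: source-language word -> single best
# query-language translation (a real one would be derived from parallel texts).
LEXICON: Dict[str, str] = {"haus": "house", "katze": "cat", "und": "and"}


def translate_document(text: str, lexicon: Dict[str, str]) -> str:
    """Translate a document word by word.

    Assumption: words missing from the lexicon are kept unchanged.
    """
    return " ".join(lexicon.get(word, word) for word in text.lower().split())


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale retrieval scores to [0, 1] so two runs are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_runs(
    qt_run: Dict[str, float], dt_run: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Fuse a query-translation run and a document-translation run.

    Assumed fusion rule: sum of normalized scores; a document absent
    from one run contributes 0 for that run.
    """
    qt_norm, dt_norm = min_max(qt_run), min_max(dt_run)
    fused = {
        doc: qt_norm.get(doc, 0.0) + dt_norm.get(doc, 0.0)
        for doc in set(qt_norm) | set(dt_norm)
    }
    # Sort by descending score, breaking ties by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))
```

For example, `translate_document("Haus und Katze", LEXICON)` yields a query-language string that a monolingual retrieval system could index, and `combine_runs` merges the scores of two ranked lists for the same query into a single ranking.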
