Should we Translate the Documents or the Queries in Cross-language Information Retrieval?

Previous comparisons of document and query translation were confounded by differences in machine translation quality between the two opposite translation directions. We avoid this difficulty by training identical statistical translation models for both translation directions on the same training data. We investigate information retrieval between English and French, incorporating both translation directions into both document translation-based and query translation-based information retrieval, as well as into hybrid systems. We find that hybrids of document and query translation-based systems outperform query translation systems, even human-quality query translation systems.
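The abstract does not say how the hybrid systems combine their two components, so the following is only an illustrative sketch of one common way to do it, not the method used in the paper. It assumes each system (query translation-based and document translation-based) returns a score per document ID, min-max normalizes each score list, and takes a weighted average. The function names, the normalization, and the `weight` parameter are all hypothetical.

```python
from typing import Dict, List


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Min-max scale scores to [0, 1]; a constant score list maps to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {d: (s - lo) / span if span > 0 else 0.0 for d, s in scores.items()}


def hybrid_scores(
    qt_scores: Dict[str, float],
    dt_scores: Dict[str, float],
    weight: float = 0.5,
) -> Dict[str, float]:
    """Weighted average of normalized query-translation (qt) and
    document-translation (dt) scores. A document missing from one
    system contributes 0.0 for that system (an assumption)."""
    qt = normalize(qt_scores)
    dt = normalize(dt_scores)
    docs = set(qt) | set(dt)
    return {
        d: weight * qt.get(d, 0.0) + (1.0 - weight) * dt.get(d, 0.0)
        for d in docs
    }


def rank(scores: Dict[str, float]) -> List[str]:
    """Document IDs by descending score, ties broken by ID."""
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    # Made-up scores, purely to show the call pattern.
    qt = {"d1": 3.0, "d2": 1.0, "d3": 2.0}
    dt = {"d1": 0.2, "d2": 0.9, "d4": 0.5}
    print(rank(hybrid_scores(qt, dt)))
```

With equal weights, a document ranked highly by only one system can still be overtaken by one that both systems rank moderately well, which is the intuition behind combining the two translation directions.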
