Combining bidirectional translation and synonymy for cross-language information retrieval

This paper introduces a general framework for the use of translation probabilities in cross-language information retrieval, based on the notion that information retrieval fundamentally requires matching what the searcher means with what the author of a document meant. That perspective yields a computational formulation that provides a natural way of combining what have been known as query translation and document translation. Two well-recognized techniques are shown to be special cases of this model under restrictive assumptions. Cross-language search results are reported that are statistically indistinguishable from strong monolingual baselines for both French and Chinese documents.
