Paper information - DOCUMENT TRANSLATION RETRIEVAL BASED ON STATISTICAL MACHINE TRANSLATION TECHNIQUES

We compare strategies for applying statistical machine translation techniques to retrieve documents that are plausible translations of a given source document. Finding the translated version of a document is useful, for example, when building a corpus of parallel texts for creating and evaluating new machine translation systems. In contrast to the traditional setting of cross-language information retrieval, here both the source and the target texts are long, so the procedure used to select which words or phrases are included in the query has a key effect on retrieval performance. In the statistical approach explored here, both the probability of the translation and the relevance of the terms are taken into account in order to build an effective query.
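
The idea of weighing translation probability against term relevance when building a query can be illustrated with a minimal sketch. It is not the authors' method: the function name `build_query`, the word-level translation table, the smoothed IDF weighting, and all the numbers below are assumptions chosen only for illustration. Each candidate target term is scored by its expected count in a translation of the source document, multiplied by an inverse document frequency, and the top-scoring terms form the query.

```python
import math
from collections import Counter


def build_query(source_tokens, translation_table, doc_freq, num_docs, k=3):
    """Pick k target-language query terms for a source document.

    translation_table: hypothetical map source word -> {target word: p(t|s)}
    doc_freq: hypothetical map target word -> number of documents containing it
    """
    term_counts = Counter(source_tokens)
    scores = {}
    for s_word, count in term_counts.items():
        for t_word, prob in translation_table.get(s_word, {}).items():
            # Expected number of occurrences of t_word in a translation.
            scores[t_word] = scores.get(t_word, 0.0) + count * prob
    for t_word in scores:
        # Smoothed IDF (an assumed relevance measure): frequent terms score low.
        idf = math.log((num_docs + 1) / (doc_freq.get(t_word, 0) + 1))
        scores[t_word] *= idf
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t_word for t_word, _ in ranked[:k]]


# Toy data, invented for this example.
table = {
    "la": {"the": 0.9, "it": 0.1},
    "casa": {"house": 0.8, "home": 0.2},
    "roja": {"red": 1.0},
}
doc_freq = {"the": 1000, "it": 900, "house": 50, "home": 80, "red": 40}

print(build_query(["la", "casa", "roja", "la", "casa"], table, doc_freq, 1000))
# ['house', 'red', 'home']
```

In this toy case "the" is the most probable translation of the most frequent source word, yet it is left out of the query because it occurs in every document of the assumed collection and therefore carries no discriminative weight.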

Felipe Sánchez-Martínez | Rafael C. Carrasco
