Paper information - Facilitating Translation Using Source Language Paraphrase Lattices

For resource-limited language pairs, how well the parallel corpus covers the test set is an important factor affecting translation quality, in two respects: 1) out-of-vocabulary words; and 2) the same information in an input sentence can be expressed in different ways, yet current phrase-based SMT systems cannot automatically select an alternative way to convey it. Therefore, given limited data, this paper proposes a novel method that facilitates translation from the input side by reducing translation difficulty with source-side lattice-based paraphrases. We use the original phrases from the input sentence and their corresponding paraphrases to build a lattice with an estimated weight on each edge, with the aim of improving translation quality. Compared to the baseline system, our method achieves relative improvements of 7.07%, 6.78% and 3.63% in BLEU score on small, medium and large-scale English-to-Chinese translation tasks, respectively. The results show that the proposed method is effective not only for resource-limited language pairs but also, to some extent, for resource-sufficient pairs.
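To make the lattice idea concrete, the following is a minimal Python sketch of how an input sentence and a paraphrase table might be combined into a weighted lattice. It is an illustration under stated assumptions, not the paper's implementation: the function name `build_paraphrase_lattice`, the shape of the paraphrase table, and the weighting scheme (weight 1.0 on original words, the paraphrase probability on the first edge of each paraphrase path) are all hypothetical, since the abstract does not say how the edge weights are estimated.

```python
from typing import Dict, List, Tuple

# An edge is (from_node, to_node, word, weight). Node i sits before token i.
Edge = Tuple[int, int, str, float]
# Hypothetical paraphrase table: source phrase -> [(paraphrase tokens, probability)]
ParaphraseTable = Dict[Tuple[str, ...], List[Tuple[List[str], float]]]


def build_paraphrase_lattice(tokens: List[str], paraphrases: ParaphraseTable) -> List[Edge]:
    n = len(tokens)
    # Backbone path: the original sentence, one edge per word.
    # Assumption: original words get weight 1.0.
    edges: List[Edge] = [(i, i + 1, tok, 1.0) for i, tok in enumerate(tokens)]
    next_node = n + 1  # fresh ids for nodes inside multi-word paraphrases

    for start in range(n):
        for end in range(start + 1, n + 1):
            phrase = tuple(tokens[start:end])
            for para_tokens, prob in paraphrases.get(phrase, []):
                if not para_tokens:
                    continue  # skip empty paraphrases
                prev = start
                for k, word in enumerate(para_tokens):
                    is_last = k == len(para_tokens) - 1
                    if is_last:
                        nxt = end  # rejoin the backbone after the phrase
                    else:
                        nxt = next_node
                        next_node += 1
                    # Assumption: the paraphrase probability sits on the first
                    # edge of the alternative path; later edges carry 1.0.
                    edges.append((prev, nxt, word, prob if k == 0 else 1.0))
                    prev = nxt
    return edges


tokens = ["he", "is", "fond", "of", "tea"]
table: ParaphraseTable = {
    ("is", "fond", "of"): [(["likes"], 0.6), (["really", "likes"], 0.3)],
}
edges = build_paraphrase_lattice(tokens, table)
for edge in edges:
    print(edge)
```

Each paraphrase adds an alternative path that leaves the backbone where the source phrase starts and rejoins it where the phrase ends, so a decoder that accepts lattice input can choose between the original wording and a paraphrase. The node ids in this sketch are arbitrary labels and are not topologically ordered; a real lattice input format may require renumbering.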
