Paper Information - Choosing the Right Translation: A Syntactically Informed Classification Approach

Choosing the Right Translation: A Syntactically Informed Classification Approach

One style of Multi-Engine Machine Translation architecture involves choosing the best of a set of outputs from different systems. Choosing the best translation from an arbitrary set, even in the presence of human references, is a difficult problem; it may prove better to look at mechanisms for making such choices in more restricted contexts. In this paper we take a classification-based approach to choosing between candidates from syntactically informed translations. The idea is that using multiple parsers as part of a classifier could help detect syntactic problems that lead to bad translations. These problems could be detected on either the source side (sentences with difficult or incorrect parses may lead to bad translations) or the target side (output quality could be measured in a more syntactically informed way, by looking for syntactic abnormalities). We find no evidence that the source-side information is useful. However, a target-side classifier, when used to identify particularly bad translation candidates, can lead to significant improvements in BLEU score. Improvements are even greater when combined with existing language and alignment model approaches.
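As a rough illustration of the selection idea in the abstract, the sketch below filters out candidates that a target-side classifier flags as particularly bad and then picks among the rest using an existing model score. Everything in it is an assumption made for illustration: the `Candidate` type, the `lm_score` field, the `choose_translation` function, and the toy `toy_is_bad` check (which merely looks for an immediately repeated word) are hypothetical stand-ins. The paper's actual parser-based features, classifier, and method of combining with language and alignment models are not described in this record.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    text: str
    # Hypothetical baseline score, e.g. a language-model log-probability
    # (higher is better). Not taken from the paper.
    lm_score: float


def choose_translation(
    candidates: Sequence[Candidate],
    is_bad: Callable[[Candidate], bool],
) -> Candidate:
    """Drop candidates the classifier flags as bad, then pick the
    highest-scoring survivor. If every candidate is flagged, fall back
    to the full set so that something is always returned."""
    kept = [c for c in candidates if not is_bad(c)]
    pool = kept if kept else list(candidates)
    return max(pool, key=lambda c: c.lm_score)


def toy_is_bad(candidate: Candidate) -> bool:
    """Toy stand-in for a syntactically informed classifier: flags a
    candidate when any word is immediately repeated. A real system
    would use parser-derived features instead."""
    tokens = candidate.text.lower().split()
    return any(a == b for a, b in zip(tokens, tokens[1:]))


if __name__ == "__main__":
    outputs = [
        Candidate("the the cat sat on the mat", lm_score=-1.0),
        Candidate("the cat sat on the mat", lm_score=-2.0),
    ]
    print(choose_translation(outputs, toy_is_bad).text)
```

Using the classifier only as a veto, rather than as the sole ranking signal, mirrors the abstract's observation that it helps most when identifying particularly bad candidates; the fallback to the full set is a design choice of this sketch.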

Mark Dras | Simon Zwarts

[1] Tadashi Nomoto. Multi-Engine Machine Translation with Voted Language Model, 2004, ACL.

[2] Chao Wang, et al. Chinese Syntactic Reordering for Statistical Machine Translation, 2007, EMNLP.

[3] Sergei Nirenburg, et al. Three Heads are Better than One, 1994, ANLP.

[4] Philipp Koehn, et al. Re-evaluating the Role of Bleu in Machine Translation Research, 2006, EACL.

[5] Stefan Riezler, et al. Grammatical Machine Translation, 2006, NAACL.

[6] Mark Dras, et al. This Phrase-Based SMT System is Out of Order: Generalised Word Reordering in Machine Translation, 2006, ALTA.

[7] Philipp Koehn, et al. Clause Restructuring for Statistical Machine Translation, 2005, ACL.

[8] John D. Lafferty, et al. A Robust Parsing Algorithm for Link Grammars, 1995, IWPT.

[9] Tadashi Nomoto. Predictive models of performance in multi-engine machine translation, 2003, MTSUMMIT.

[10] Stephen Wan, et al. GLEU: Automatic Evaluation of Sentence-Level Fluency, 2007, ACL.

[11] Philipp Koehn, et al. Pharaoh: A Beam Search Decoder for Phrase-Based Statistical Machine Translation Models, 2004, AMTA.