Paper Information - Tighter Integration of Rule-Based and Statistical MT in Serial System Combination

Recent papers have described machine translation (MT) based on an automatic post-editing or serial combination strategy, in which the input language is first translated into the target language by a rule-based MT (RBMT) system and the target-language output is then automatically post-edited by a phrase-based statistical machine translation (SMT) system. This approach has been shown to improve MT quality over RBMT or SMT alone. In that previous work, the two systems were only loosely coupled: the SMT system had access only to the final 1-best translations from RBMT. Furthermore, the previous work involved European language pairs and relatively small training corpora. In this paper, we describe a more tightly integrated serial combination for the Chinese-to-English MT task. We present experimental evaluation results on the 2008 NIST constrained data track, where the tighter coupling of the two systems yields a significant gain in both automatic and subjective metrics.
