Hybrid Machine Translation Guided by a Rule-Based System

This paper presents a machine translation architecture that hybridizes Matxin, a rule-based system, with standard phrase-based Statistical Machine Translation (SMT). In short, the rule-based engine guides the hybrid translation process: before transfer, a set of partial candidate translations provided by SMT subsystems is used to enrich the tree-based representation. The final hybrid translation is then created by a statistical decoder, which monotonically chooses the most probable combination among the available fragments. We have applied the hybrid model to a pair of distant languages, Spanish and Basque, and according to our evaluation (both automatic and manual) the hybrid approach significantly outperforms the best SMT system on out-of-domain data.
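The final selection step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each node of the enriched tree has already been flattened into an ordered list of segments, that each segment carries non-empty candidate fragments from the rule-based and SMT subsystems with a log-probability score, and that a toy bigram language model stands in for the decoder's models. The names `Candidate`, `score_fragment`, and `monotone_decode` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate fragment: (target text, origin tag, translation log-prob).
Candidate = Tuple[str, str, float]
BigramLM = Dict[Tuple[str, str], float]


def score_fragment(prev_word: str, text: str, lm: BigramLM, floor: float) -> float:
    """Sum toy bigram log-probs for the words of `text`, given the preceding word."""
    total = 0.0
    for word in text.split():
        total += lm.get((prev_word, word), floor)  # unseen bigrams get `floor`
        prev_word = word
    return total


def monotone_decode(
    segments: List[List[Candidate]], lm: BigramLM, floor: float = -5.0
) -> Tuple[List[Candidate], float]:
    """Pick one candidate per segment, left to right with no reordering,
    maximizing translation score plus bigram LM score (Viterbi search)."""
    if not segments:
        return [], 0.0
    # best[j] = (score, path) of the best hypothesis ending in candidate j.
    best = [
        (cand[2] + score_fragment("<s>", cand[0], lm, floor), [cand])
        for cand in segments[0]
    ]
    for segment in segments[1:]:
        new_best = []
        for cand in segment:
            options = []
            for prev_score, path in best:
                # Fragments are assumed non-empty; fall back to "<s>" otherwise.
                prev_word = (path[-1][0].split() or ["<s>"])[-1]
                lm_score = score_fragment(prev_word, cand[0], lm, floor)
                options.append((prev_score + cand[2] + lm_score, path))
            score, path = max(options, key=lambda o: o[0])
            new_best.append((score, path + [cand]))
        best = new_best
    best_score, best_path = max(best, key=lambda b: b[0])
    return best_path, best_score


# Placeholder tokens (x1, y2, ...) rather than real Basque output.
segments = [
    [("x1", "rbmt", -1.0), ("x2", "smt", -1.2)],
    [("y1", "rbmt", -1.0), ("y2", "smt", -0.5)],
]
toy_lm = {("<s>", "x1"): -1.0, ("<s>", "x2"): -0.1, ("x2", "y2"): -0.2}
path, score = monotone_decode(segments, toy_lm)
print(" ".join(text for text, _, _ in path), round(score, 3))
```

Because the search is monotone, the order of the segments is fixed by the rule-based analysis and only the choice of fragment at each position is left to the statistical scores.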
