The UA-Prompsit hybrid machine translation system for the 2014 Workshop on Statistical Machine Translation

This paper describes the system jointly developed by members of the Departament de Llenguatges i Sistemes Informàtics at Universitat d’Alacant and the Prompsit Language Engineering company for the shared translation task of the 2014 Workshop on Statistical Machine Translation. We present a phrase-based statistical machine translation system whose phrase table is enriched with information obtained from dictionaries and shallow-transfer rules like those used in rule-based machine translation. The novelty of our approach is that the transfer rules were not written by humans but automatically inferred from a parallel corpus.
