SmartMATE: Online Self-Serve Access to State-of-the-Art SMT

Access to good-quality Machine Translation (MT) has never been as easy as it is today. Portals such as Google Translate and Bing Translator handle huge numbers of translation requests daily, for an ever-increasing spectrum of language pairs. People are finding many uses for the raw MT output provided by these and other freely available engines on the web, including gisting, assimilation, and first drafts of translations for dissemination. However, each of these systems is a 'one-size-fits-all' solution, where no customization is available to the user. One alternative is to purchase a system, which may be overly expensive, or sub-optimal for the type of documentation to be translated. Another alternative is to install a freely available system such as Moses, but this may prove unduly onerous for the naïve user. In this paper, we present a portal which facilitates self-serve MT using state-of-the-art statistical MT (SMT). The portal is currently free for anyone to access, and users can personalise their system by uploading their own Translation Memory (TM) and glossaries. With a single key-press, optimal training, development and test data are created on-the-fly and then used for an automatic system build, with the results published in very acceptable amounts of time, together with automatic evaluation scores. According to user trials, this, together with built-in TM and online editing functionality, is a very exciting development, with the potential to vastly expand the user-base for self-serve SMT on a global basis.
