Sensible: L2 Translation Assistance by Emulating the Manual Post-Editing Process

This paper describes the Post-Editor Z system submitted to the L2 writing assistant task at SemEval-2014. The aim of the task is to build a translation assistance system that translates untranslated sentence fragments. This resembles post-editing, in which human translators improve machine-generated translations. Post-Editor Z emulates the manual post-editing process by (i) crawling a Web-based translation memory and extracting parallel sentences that contain the untranslated fragments, (ii) extracting the possible translations of the fragments indexed by the translation memory, and (iii) applying simple cosine-based sentence similarity to rank the possible translations for each untranslated fragment.
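
Step (iii) can be illustrated with a minimal sketch. The abstract does not specify the tokenization, the term weighting, or which sentences are compared, so the code below assumes whitespace tokens, raw term counts, and a comparison between the input sentence and each translation-memory sentence that contains the fragment. The names `cosine_similarity`, `rank_translations`, and `memory_hits` are hypothetical and are not taken from the system itself.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words count vectors of two sentences.

    Assumption: lowercased whitespace tokens with raw counts; the paper's
    actual preprocessing is not stated in the abstract.
    """
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty sentence has no direction to compare
    return dot / (norm_a * norm_b)


def rank_translations(query: str, memory_hits: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Rank candidate fragment translations by sentence similarity.

    memory_hits is a hypothetical list of (memory_sentence, candidate)
    pairs, standing in for the output of steps (i) and (ii). Each candidate
    keeps the best score over all memory sentences that propose it.
    """
    best: dict[str, float] = {}
    for memory_sentence, candidate in memory_hits:
        score = cosine_similarity(query, memory_sentence)
        best[candidate] = max(score, best.get(candidate, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


# Illustrative data only: a sentence with an untranslated fragment ("Zimmer")
# and two made-up translation-memory hits for that fragment.
query = "I want to book a Zimmer for tonight"
memory_hits = [
    ("I want to book a room for tonight", "room"),
    ("the chamber of commerce met tonight", "chamber"),
]
ranked = rank_translations(query, memory_hits)
```

In this toy example the first memory sentence shares seven of its eight tokens with the input, so `room` is ranked above `chamber`. Keeping the maximum score per candidate is one possible aggregation choice; summing or averaging over supporting sentences would be equally plausible under the abstract's description.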
