Predicting Translation Performance with Referential Translation Machines

Referential translation machines (RTMs) achieve top performance in both bilingual and monolingual settings without accessing any task- or domain-specific information or resources. RTMs obtain the 3rd best system results for German-to-English sentence-level prediction of translation quality and the 2nd best system results according to root mean squared error. In addition to adding new features based on substring distances, punctuation tokens, character n-grams, and alignment crossings, as well as additional learning models, we average the prediction scores of different models, using weights based on their training performance, to improve results.
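
The averaging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state how the weights are derived from training performance, so the inverse-error weighting below is an assumption, and the function names (`rmse`, `performance_weights`, `weighted_average`) and the toy numbers are hypothetical rather than taken from the RTM system.

```python
from math import sqrt


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sqrt(sum(squared) / len(targets))


def performance_weights(train_errors, eps=1e-12):
    """Turn per-model training errors into weights that sum to 1.

    Assumption (not from the paper): each weight is proportional to
    1 / error, so a model with lower training error gets more weight.
    """
    inverse = [1.0 / (e + eps) for e in train_errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_average(model_predictions, weights):
    """Combine per-model prediction lists into one list of averaged scores."""
    return [
        sum(w * p for w, p in zip(weights, column))
        for column in zip(*model_predictions)
    ]


# Toy data: two hypothetical models scored on three training sentences.
train_targets = [0.2, 0.4, 0.6]
train_preds = [[0.25, 0.35, 0.65], [0.4, 0.2, 0.9]]
errors = [rmse(p, train_targets) for p in train_preds]
weights = performance_weights(errors)

# Predictions of the same two models on two unseen sentences.
test_preds = [[0.30, 0.50], [0.10, 0.70]]
combined = weighted_average(test_preds, weights)
```

Here the first model has the lower training error, so the combined scores lie closer to its predictions. Other weighting schemes, such as weights based on a correlation measure, would fit the same description equally well.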
