Accuracy-Based Scoring for Phrase-Based Statistical Machine Translation

Although the scoring features of state-of-the-art Phrase-Based Statistical Machine Translation (PB-SMT) models are weighted so as to optimise an objective function measuring translation quality, the features themselves are estimated without reference to such quality metrics. In this paper, we introduce a translation quality-based feature for PB-SMT with the aim of improving the translation quality of the system. Our feature is estimated by averaging the edit distance between the phrase pairs involved in the translation of oracle sentences, which are chosen by automatic evaluation metrics from the N-best outputs of a baseline system, and the phrase pairs occurring in the N-best list. Using our method, we report a statistically significant 2.11% relative improvement in BLEU score for the WMT 2009 Spanish-to-English translation task. We also report statistically significant improvements over the baseline on many other MT evaluation metrics. In addition, an 87% reduction in phrase-table size yields a substantial increase in speed and reduction in memory use while maintaining significant gains in translation quality.
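
The averaging step described above can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the edit distance is taken to be token-level Levenshtein distance, phrase pairs are matched on their source side, and the names `edit_distance` and `average_distance_feature` are hypothetical rather than taken from the paper.

```python
from collections import defaultdict


def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def average_distance_feature(oracle_pairs, nbest_pairs):
    """Assumed formulation (not the paper's): for every (source, target)
    phrase pair seen in the N-best list, average the token-level edit
    distance between its target side and the target sides that the
    oracle translations used for the same source phrase."""
    oracle_targets = defaultdict(list)
    for src, tgt in oracle_pairs:
        oracle_targets[src].append(tgt.split())

    totals = defaultdict(float)
    counts = defaultdict(int)
    for src, tgt in nbest_pairs:
        for ref in oracle_targets.get(src, []):
            totals[(src, tgt)] += edit_distance(tgt.split(), ref)
            counts[(src, tgt)] += 1

    # Pairs whose source phrase never occurs in an oracle are left unscored.
    return {pair: totals[pair] / counts[pair] for pair in counts}


# Toy data, invented for illustration only.
oracle_pairs = [("la casa", "the house")]
nbest_pairs = [("la casa", "the house"),
               ("la casa", "the home"),
               ("la casa", "house")]
scores = average_distance_feature(oracle_pairs, nbest_pairs)
```

Under these assumptions, a phrase pair that matches the oracle choice scores 0 and pairs that diverge from it score higher, so a low value marks a phrase pair that tends to appear in high-quality translations.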
