Paper information - MT-ComparEval: Graphical evaluation interface for Machine Translation development

Abstract: The tool described in this article is designed to help MT developers. It implements a web-based graphical user interface for systematically comparing and evaluating MT engines and experiments through comparative analysis based on automatic measures and statistics. The evaluation panel provides graphs, tests for statistical significance, and n-gram statistics. We also present a demo server, http://wmt.ufal.cz, with WMT14 and WMT15 translations.
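The abstract does not say which significance test the evaluation panel uses. As an illustration only, the following is a minimal sketch of paired bootstrap resampling, one test commonly applied when comparing two MT systems on the same test set. The function name `paired_bootstrap`, its parameters, and the use of per-sentence scores are assumptions made for this sketch; they are not taken from MT-ComparEval.

```python
import random


def paired_bootstrap(scores_a, scores_b, n_samples=1000, seed=0):
    """Estimate how often system A beats system B on resampled test sets.

    scores_a, scores_b: per-sentence quality scores for the same sentences,
    in the same order (hypothetical inputs; any sentence-level metric works).
    Returns the fraction of bootstrap samples in which A's total score is
    strictly higher than B's.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")

    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(scores_a)
    wins_a = 0
    for _ in range(n_samples):
        # Draw sentence indices with replacement; both systems are scored
        # on the same resampled sentences, which is what makes the test paired.
        idx = [rng.randrange(n) for _ in range(n)]
        total_a = sum(scores_a[i] for i in idx)
        total_b = sum(scores_b[i] for i in idx)
        if total_a > total_b:
            wins_a += 1
    return wins_a / n_samples
```

A result close to 1.0 suggests that system A outperforms system B consistently across resampled test sets, while a result near 0.5 suggests the difference may be due to chance. This sketch sums sentence-level scores for simplicity; a corpus-level metric such as BLEU would have to be recomputed on each resampled set instead.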