The JHU-Microsoft Submission for WMT21 Quality Estimation Shared Task

This paper presents the JHU-Microsoft joint submission to the WMT 2021 Quality Estimation shared task. We participate only in Task 2 (post-editing effort estimation), focusing on target-side word-level quality estimation. The techniques we experiment with include Levenshtein Transformer training and data augmentation combining forward translation, backward translation, round-trip translation, and pseudo post-editing of the MT output. We demonstrate the competitiveness of our system against the widely adopted OpenKiwi-XLM baseline. Our system is also the top-ranking system on the MT MCC metric for the English-German language pair.
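The word-level task assigns each MT token an OK or BAD tag depending on whether it survives into the (pseudo) post-edit. The snippet below is a minimal illustrative sketch of how such tags can be derived, using Python's difflib for alignment rather than the official TER-based tagging pipeline, and with made-up example sentences; it is not the system described in the paper.

```python
# Sketch only: derive word-level OK/BAD tags by aligning an MT hypothesis
# against a (pseudo) post-edit. The real WMT labels use TER-based alignment;
# difflib is used here purely for illustration.
from difflib import SequenceMatcher

def word_tags(mt_tokens, pe_tokens):
    """Tag each MT token OK if it is kept in the post-edit, BAD otherwise."""
    tags = ["BAD"] * len(mt_tokens)
    matcher = SequenceMatcher(a=mt_tokens, b=pe_tokens, autojunk=False)
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            tags[i] = "OK"
    return tags

if __name__ == "__main__":
    mt = "das ist ein kleines Haus".split()   # hypothetical MT output
    pe = "das ist ein großes Haus".split()    # hypothetical post-edit
    print(list(zip(mt, word_tags(mt, pe))))
    # [('das', 'OK'), ('ist', 'OK'), ('ein', 'OK'), ('kleines', 'BAD'), ('Haus', 'OK')]
```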
