Quality Estimation Using Dual Encoders with Transfer Learning

This paper describes POSTECH’s quality estimation systems submitted to Task 2 of the WMT 2021 quality estimation shared task: Word and Sentence-Level Post-editing Effort. We observe that the stability of recent quality estimation models, which rely on a single self-attention-based encoder to process both inputs (a source sequence and its machine translation) simultaneously, can be improved, because such models neglect pre-trained monolingual representations, which are generally accepted as reliable for a wide range of natural language processing tasks. Our model therefore uses two pre-trained monolingual encoders and exchanges information between the two encoded representations through two additional cross-attention networks. According to the official leaderboard, our systems outperform the baseline systems in terms of the Matthews correlation coefficient for word-level quality estimation of machine translations and in terms of Pearson’s correlation coefficient for sentence-level quality estimation, by 0.4126 and 0.5497 respectively.
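The sketch below is a minimal illustration of the architecture the abstract describes: two pre-trained monolingual encoders whose outputs are exchanged through two cross-attention networks, followed by a word-level tagging head and a sentence-level regression head. The encoder checkpoints, pooling strategy, and head dimensions are assumptions made for illustration and are not the submitted system.

```python
import torch
import torch.nn as nn
from transformers import AutoModel


class DualEncoderQE(nn.Module):
    """Dual-encoder quality estimation sketch: each language gets its own
    pre-trained monolingual encoder, and the two encoded sequences exchange
    information through two cross-attention layers (assumed configuration)."""

    def __init__(self,
                 src_model="bert-base-cased",        # hypothetical source-language encoder
                 mt_model="bert-base-german-cased",  # hypothetical target-language encoder
                 hidden=768, heads=8):
        super().__init__()
        self.src_encoder = AutoModel.from_pretrained(src_model)
        self.mt_encoder = AutoModel.from_pretrained(mt_model)
        # Two cross-attention networks, one per direction, so each side can
        # attend over the other's representation.
        self.mt_to_src = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.src_to_mt = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.word_head = nn.Linear(hidden, 2)   # OK / BAD tag per MT token
        self.sent_head = nn.Linear(hidden, 1)   # sentence-level score (e.g. HTER)

    def forward(self, src_ids, src_mask, mt_ids, mt_mask):
        # Encode each input with its own monolingual encoder.
        src = self.src_encoder(input_ids=src_ids, attention_mask=src_mask).last_hidden_state
        mt = self.mt_encoder(input_ids=mt_ids, attention_mask=mt_mask).last_hidden_state
        # MT tokens attend to the source representation, and vice versa.
        mt_ctx, _ = self.mt_to_src(query=mt, key=src, value=src,
                                   key_padding_mask=~src_mask.bool())
        src_ctx, _ = self.src_to_mt(query=src, key=mt, value=mt,
                                    key_padding_mask=~mt_mask.bool())
        word_logits = self.word_head(mt_ctx)                  # (batch, mt_len, 2)
        # Mean-pool both cross-attended sequences for the sentence-level score
        # (pooling choice is an assumption).
        pooled = (mt_ctx.mean(dim=1) + src_ctx.mean(dim=1)) / 2
        sent_score = self.sent_head(pooled).squeeze(-1)       # (batch,)
        return word_logits, sent_score
```

In this reading, the word-level head is trained with a token-level classification loss over OK/BAD tags and the sentence-level head with a regression loss against post-editing effort scores; the actual training objectives and hyperparameters follow the paper, not this sketch.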
