Verdi: Quality Estimation and Error Detection for Bilingual Corpora

Translation quality estimation is critical to reducing post-editing effort in machine translation and to cross-lingual corpus cleaning. As a research problem, quality estimation (QE) aims to directly estimate the quality of a translation given a pair of source and target sentences, and to highlight the words that need correction, without reference to gold-standard translations. In this paper, we propose Verdi, a novel framework for word-level and sentence-level post-editing effort estimation on bilingual corpora. Verdi adopts two word predictors, a transformer-based neural machine translation (NMT) model and a pre-trained cross-lingual language model (XLM), so that diverse features can be extracted from a sentence pair for subsequent quality estimation. We exploit the symmetric nature of bilingual corpora and apply model-level dual learning in the NMT predictor, which handles a primal task and a dual task simultaneously with weight sharing, yielding stronger context prediction ability than single-direction NMT models. Building on the dual learning scheme, we further design a novel feature that directly encodes the translated target information without relying on the source context. Extensive experiments on the WMT20 QE tasks demonstrate that our method beats the winner of the competition and outperforms other baseline methods by a large margin. We further use the sentence-level scores produced by Verdi to clean a parallel corpus and observe benefits in both model performance and training efficiency.
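The abstract's central architectural idea, model-level dual learning with weight sharing, can be illustrated with a minimal sketch: the primal (source-to-target) and dual (target-to-source) tasks train against one shared parameter set, so gradient updates from either direction improve the same model. All names below (`DualNMT`, `step`, the toy parameter table and learning rate) are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of model-level dual learning as used in Verdi's NMT
# predictor: the primal (src->tgt) and dual (tgt->src) tasks share one set
# of weights, so both directions update the same parameters.

class DualNMT:
    def __init__(self):
        # A single shared parameter table serves both translation directions.
        self.shared_weights = {"embed": [0.0] * 4}

    def step(self, direction, grad):
        # A gradient step from either the primal or the dual task updates
        # the same shared weights (toy SGD with learning rate 0.1).
        w = self.shared_weights["embed"]
        self.shared_weights["embed"] = [x - 0.1 * grad for x in w]
        return direction

model = DualNMT()
model.step("src->tgt", grad=1.0)   # primal-task update
model.step("tgt->src", grad=1.0)   # dual-task update
print(model.shared_weights["embed"])  # both directions moved the same weights
```

In a real implementation the shared component would be the transformer's parameters and the updates would come from backpropagation, but the key property is the same: two directional tasks, one set of weights.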
