MLQE-PE: A Multilingual Quality Estimation and Post-Editing Dataset

We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE) and Automatic Post-Editing (APE). The dataset contains seven language pairs, with human labels for 9,000 translations per language pair in the following formats: sentence-level direct assessments and post-editing effort, and word-level good/bad labels. It also contains the post-edited sentences, the titles of the articles from which the sentences were extracted, and the neural MT models used to translate the text.

Lucia Specia | Vishrav Chaudhary | Marina Fomicheva | Shuo Sun | Erick Fonseca | André F. T. Martins | Frédéric Blain | Francisco Guzmán | Nina Lopatina
