Findings of the WMT 2021 Shared Task on Efficient Translation

The machine translation efficiency task challenges participants to make their systems faster and smaller with minimal impact on translation quality. How much quality to sacrifice for efficiency depends on the application, so participants were encouraged to make multiple submissions covering the space of trade-offs. In total, 4 teams made 53 submissions. Hardware tracks covered GPU, single-core CPU, and multi-core CPU, with batched throughput and single-sentence latency conditions. Submissions showed that hundreds of millions of words can be translated for a dollar, that average latency is 5–17 ms, and that models fit in 7.5–150 MB.
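
As a rough illustration of how a words-per-dollar figure can be derived, the sketch below converts a sustained throughput and an hourly machine price into words translated per dollar. The function name and the example inputs are hypothetical and are not taken from the task's measurements.

```python
def words_per_dollar(words_per_second: float, dollars_per_hour: float) -> float:
    """Return how many words are translated per dollar of machine rental.

    Assumes the machine is billed by the hour and runs at a constant
    throughput for the whole billed period.
    """
    if words_per_second <= 0 or dollars_per_hour <= 0:
        raise ValueError("throughput and price must be positive")
    seconds_per_hour = 3600.0
    return words_per_second * seconds_per_hour / dollars_per_hour


# Hypothetical inputs chosen only to show the arithmetic.
example = words_per_dollar(words_per_second=100_000, dollars_per_hour=1.0)
print(f"{example:,.0f} words per dollar")
```

Under these assumed inputs the result is in the hundreds of millions of words per dollar; a real figure depends on the measured throughput and the actual price of the hardware.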
