Adaptive Machine Translation with Large Language Models

Consistency is a key requirement of high-quality translation. It is especially important to adhere to pre-approved terminology and to adapt to corrected translations in domain-specific projects. Machine translation (MT) has achieved significant progress in the area of domain adaptation; however, real-time adaptation remains challenging. Large language models (LLMs) have recently demonstrated in-context learning capabilities: without further fine-tuning, they learn to replicate input-output text generation patterns from examples supplied in the prompt. When an LLM is fed at inference time a prompt consisting of a list of translation pairs, it can emulate the domain and style characteristics of those pairs while translating new text. This work investigates how in-context learning can be exploited to improve real-time adaptive MT. Our extensive experiments show promising results at translation time: for example, LLMs can adapt to a set of in-domain sentence pairs and/or terminology while translating a new sentence. We observe that translation quality with few-shot in-context learning can surpass that of strong encoder-decoder MT systems, especially for high-resource languages. Moreover, we investigate combining MT output from strong encoder-decoder models with fuzzy matches in the prompt, which can further improve translation quality, especially for less-supported languages. We conduct our experiments across five diverse language pairs, namely English-to-Arabic (EN-AR), English-to-Chinese (EN-ZH), English-to-French (EN-FR), English-to-Kinyarwanda (EN-RW), and English-to-Spanish (EN-ES).
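
To make the in-context learning setup concrete, the following minimal Python sketch builds such a few-shot prompt from fuzzy matches retrieved from a translation memory. It is an illustration under assumptions, not the paper's exact pipeline: fuzzy-match retrieval here uses the standard-library difflib as a dependency-free stand-in for embedding-based similarity search, and the names (top_fuzzy_matches, build_prompt, the sample translation memory) are hypothetical.

    import difflib

    def top_fuzzy_matches(source, translation_memory, k=3):
        """Return the k (source, target) pairs from the translation memory
        whose source side is most similar to the new source sentence.
        difflib's character-level ratio stands in for embedding similarity."""
        scored = sorted(
            translation_memory,
            key=lambda pair: difflib.SequenceMatcher(None, source, pair[0]).ratio(),
            reverse=True,
        )
        return scored[:k]

    def build_prompt(source, matches, src_lang="English", tgt_lang="French"):
        """Lay out the fuzzy matches as translation examples, then append
        the new sentence with an empty target side for the LLM to complete."""
        lines = []
        for src, tgt in matches:
            lines.append(f"{src_lang}: {src}")
            lines.append(f"{tgt_lang}: {tgt}")
        lines.append(f"{src_lang}: {source}")
        lines.append(f"{tgt_lang}:")
        return "\n".join(lines)

    # Hypothetical in-domain translation memory.
    tm = [
        ("The patient shows mild symptoms.",
         "Le patient présente des symptômes légers."),
        ("Symptoms may include fever and cough.",
         "Les symptômes peuvent inclure de la fièvre et de la toux."),
        ("Wash your hands regularly.",
         "Lavez-vous les mains régulièrement."),
    ]

    new_sentence = "The patient shows severe symptoms."
    prompt = build_prompt(new_sentence, top_fuzzy_matches(new_sentence, tm, k=2))
    print(prompt)  # the LLM's completion of this prompt is the adapted translation

The completion the LLM produces for this prompt is the adapted translation. Under the same scheme, MT output from an encoder-decoder system can additionally be appended to the prompt as a hint for the model to refine, which is the combination the abstract describes for less-supported languages.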
