A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units

Languages with free word order, such as Arabic dialects, pose significant difficulty for neural machine translation (NMT) because they contain many rare words that NMT systems translate poorly. Since NMT systems operate with a fixed-size vocabulary, out-of-vocabulary words are represented by unknown-word (UNK) tokens. In the proposed approach, rare words are encoded entirely as sequences of subword pieces using the WordPiece model. This paper introduces the first Transformer-based neural machine translation model for Arabic dialects that employs subword units; the proposed solution builds on the recently introduced Transformer architecture. The use of subword units and a vocabulary shared between the Arabic dialect (the source language) and Modern Standard Arabic (the target language) improves the behavior of the encoder's multi-head attention sublayers, which capture the overall dependencies among the words of a dialectal input sentence. Experiments are carried out on Levantine Arabic (LEV) to Modern Standard Arabic (MSA), Maghrebi Arabic (MAG) to MSA, Gulf Arabic to MSA, Nile Basin Arabic to MSA, and Iraqi Arabic (IRQ) to MSA translation tasks. Extensive experiments confirm that the proposed model adequately addresses the unknown-word problem and improves translation quality from Arabic dialects to MSA.
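To make the shared-subword idea concrete, the following is a minimal sketch (not the authors' actual pipeline) of training a single WordPiece vocabulary over both the dialect side and the MSA side of the corpus, using the Hugging Face tokenizers library; the file names, vocabulary size, and example sentence are illustrative assumptions.

```python
# Minimal sketch (not the paper's exact pipeline): train one WordPiece
# vocabulary shared between the dialect (source) and MSA (target) sides,
# so rare words are split into subword pieces instead of mapping to UNK.
# File names and hyperparameters are illustrative assumptions.
from tokenizers import Tokenizer
from tokenizers.models import WordPiece
from tokenizers.trainers import WordPieceTrainer
from tokenizers.pre_tokenizers import Whitespace

tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

trainer = WordPieceTrainer(
    vocab_size=16000,  # the fixed-size vocabulary the abstract refers to
    special_tokens=["[UNK]", "[PAD]", "[BOS]", "[EOS]"],
)

# Train on both sides at once so source and target share one vocabulary
# (hypothetical corpus files, one sentence per line).
tokenizer.train(files=["dialect.txt", "msa.txt"], trainer=trainer)

# A rare dialectal word now decomposes into known subword pieces
# rather than collapsing to a single UNK token.
encoding = tokenizer.encode("مشيت عالسوق مبارح")
print(encoding.tokens)  # hypothetical output, e.g. ['مشيت', 'عال', '##سوق', ...]
```

The multi-head attention sublayers mentioned above are built from scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. The numpy sketch below shows a single attention head over a toy sequence of subword positions; it illustrates the standard Transformer formulation, not code from the paper.

```python
# Single-head scaled dot-product attention (standard Transformer
# formulation): each position attends to every other position, which is
# how the encoder captures global dependencies across the input sentence.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # pairwise query-key similarity
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ V                               # weighted sum of values

# Toy example: 4 subword positions, 8-dimensional representations.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (4, 8)
```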
