Handling OOV Words in NMT Using Unsupervised Bilingual Embedding

Neural machine translation has recently become the premier approach in machine translation; however, it still has unsolved issues. In this paper, we focus on handling out-of-vocabulary (OOV) words, an open problem in neural machine translation. The method we introduce chooses appropriate in-vocabulary alternatives for OOV words by considering word embeddings trained on monolingual corpora. Both monolingual and bilingual embeddings are used to find a proper substitute for each OOV word. Using this technique, we improve translation quality by up to 2.3 BLEU points without using any additional annotated data.
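
The substitution idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a plain nearest-neighbour search by cosine similarity over source-side monolingual embeddings, and it leaves out the bilingual embeddings the abstract also mentions. The function names (`nearest_in_vocab`, `replace_oov`), the `<unk>` fallback, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_in_vocab(word, embeddings, vocab):
    """Return the in-vocabulary word whose embedding is closest to `word`.

    Returns None when `word` has no embedding or no candidate has one.
    """
    if word not in embeddings:
        return None
    query = embeddings[word]
    best, best_score = None, -2.0  # cosine similarity is never below -1
    for cand in vocab:
        if cand == word or cand not in embeddings:
            continue
        score = cosine(query, embeddings[cand])
        if score > best_score:
            best, best_score = cand, score
    return best


def replace_oov(tokens, embeddings, vocab, unk="<unk>"):
    """Replace each OOV token with its nearest in-vocabulary neighbour.

    Tokens with no usable embedding fall back to the `unk` symbol
    (an assumption of this sketch, not something the abstract states).
    """
    out = []
    for tok in tokens:
        if tok in vocab:
            out.append(tok)
        else:
            sub = nearest_in_vocab(tok, embeddings, vocab)
            out.append(sub if sub is not None else unk)
    return out


# Toy 3-dimensional embeddings, invented for illustration only.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.0],
    "car": [0.0, 0.1, 0.9],
    "kitten": [0.95, 0.05, 0.0],
}
toy_vocab = {"the", "cat", "dog", "car", "sleeps"}

result = replace_oov(["the", "kitten", "sleeps"], toy_embeddings, toy_vocab)
print(result)
```

In this toy setting the OOV token "kitten" is mapped to an in-vocabulary neighbour before the sentence would be passed to the translation model. A real system would load embeddings trained on large monolingual corpora and would likely use an indexed nearest-neighbour search rather than a linear scan.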
