NMTPY: A Flexible Toolkit for Advanced Neural Machine Translation Systems

Abstract: In this paper, we present nmtpy, a flexible Python toolkit based on Theano for training Neural Machine Translation and other neural sequence-to-sequence architectures. nmtpy decouples the specification of a network from the training and inference utilities, simplifying the addition of new architectures and reducing the amount of boilerplate code to be written. nmtpy has been used for LIUM's top-ranked submissions to the WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.
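The decoupling the abstract describes can be illustrated with a minimal sketch. This is NOT nmtpy's actual API; the class and method names below are hypothetical, chosen only to show how an architecture definition can be kept separate from a generic, reusable training loop.

```python
# Illustrative sketch (hypothetical names, not nmtpy's real API):
# the architecture class only defines the computation, while a generic
# Trainer can drive any model exposing the same interface.

class Seq2SeqModel:
    """Hypothetical architecture: defines the model, nothing else."""
    def __init__(self, vocab_size, dim):
        self.vocab_size = vocab_size
        self.dim = dim

    def forward(self, batch):
        # Placeholder "loss": a real model would build and evaluate
        # its computation graph here.
        return float(len(batch))


class Trainer:
    """Generic training utility, reusable across architectures."""
    def __init__(self, model):
        self.model = model

    def train(self, batches):
        # Run the model on each batch and report the mean loss.
        losses = [self.model.forward(b) for b in batches]
        return sum(losses) / len(losses)


# Adding a new architecture only requires a new model class;
# the Trainer is untouched.
trainer = Trainer(Seq2SeqModel(vocab_size=30000, dim=256))
avg_loss = trainer.train([[1, 2], [3, 4, 5]])
```

Under this pattern, the per-architecture code shrinks to the model class itself, which is the boilerplate reduction the abstract refers to.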
