End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification

Autoregressive decoding is the only part of sequence-to-sequence models that prevents them from being massively parallelized at inference time. Non-autoregressive models enable the decoder to generate all output symbols in parallel, independently of each other. We present a novel non-autoregressive architecture based on connectionist temporal classification and evaluate it on the task of neural machine translation. Unlike other non-autoregressive methods, which operate in several steps, our model can be trained end-to-end. We conduct experiments on the WMT English-Romanian and English-German datasets. Our models achieve a significant speedup over autoregressive models while keeping the translation quality comparable to that of other non-autoregressive models.
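
As a rough illustration of the idea (a minimal sketch, not the authors' implementation), the code below shows how a connectionist temporal classification loss can be attached to a parallel decoder: each encoder state is projected to k output slots, every position is predicted independently, and torch.nn.CTCLoss marginalizes over alignments between the parallel predictions and the reference translation. The class name CTCNonAutoregressiveHead, the expansion factor k, and all shapes and hyperparameters here are assumptions made for the example.

```python
# Sketch of a CTC-based non-autoregressive output head (assumed names/shapes).
import torch
import torch.nn as nn

class CTCNonAutoregressiveHead(nn.Module):
    def __init__(self, d_model: int, vocab_size: int, k: int = 2):
        super().__init__()
        self.k = k                                       # output slots per source token
        self.expand = nn.Linear(d_model, k * d_model)    # lengthen the sequence
        self.proj = nn.Linear(d_model, vocab_size + 1)   # +1 for the CTC blank symbol

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, src_len, d_model) from any encoder, e.g. a Transformer
        b, s, d = enc_states.shape
        expanded = self.expand(enc_states).view(b, s * self.k, d)
        # All positions are predicted in parallel; no autoregressive dependency.
        return self.proj(expanded).log_softmax(-1)       # (batch, k*src_len, vocab+1)

# Training step: CTC sums over all alignments of the parallel predictions
# to the reference, so the model can be trained end-to-end in one pass.
vocab_size = 32000
head = CTCNonAutoregressiveHead(d_model=512, vocab_size=vocab_size, k=2)
ctc_loss = nn.CTCLoss(blank=vocab_size, zero_infinity=True)

enc_states = torch.randn(8, 20, 512)                    # dummy encoder output
targets = torch.randint(0, vocab_size, (8, 25))         # dummy reference token ids
log_probs = head(enc_states).transpose(0, 1)            # (T, batch, vocab+1) for CTCLoss
input_lengths = torch.full((8,), 40, dtype=torch.long)  # k * src_len
target_lengths = torch.full((8,), 25, dtype=torch.long)
loss = ctc_loss(log_probs, targets, input_lengths, target_lengths)

# At inference, greedy decoding is an argmax over every position at once,
# followed by collapsing repeated tokens and removing blanks.
```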
