Smart-Start Decoding for Neural Machine Translation

Most current neural machine translation models adopt a monotonic decoding order, either left-to-right or right-to-left. In this work, we propose a novel method, called Smart-Start decoding, that breaks the limitation of these fixed decoding orders. More specifically, our method first predicts a median word; it then decodes the words to the right of the median word and subsequently generates the words to its left. We evaluate the proposed Smart-Start decoding method on three datasets. Experimental results show that the proposed method significantly outperforms strong baseline models.
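To make the decoding order concrete, the following is a minimal sketch of how a target sentence could be reordered around a predicted median word, so that the right-hand segment is generated first and the left-hand segment afterwards. The function name `reorder_target`, the `<sep>` marker, and the left-to-right order of the left-hand segment are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical illustration of Smart-Start-style target reordering.
# Assumed names: reorder_target, "<sep>"; the exact handling of the
# left-hand segment is an assumption for illustration only.

def reorder_target(tokens, start_index, sep="<sep>"):
    """Reorder target tokens so decoding begins at `start_index`.

    The segment from the chosen median word to the end of the sentence
    is emitted first (left-to-right), a separator marks the switch, and
    the remaining prefix is emitted afterwards.
    """
    right = tokens[start_index:]   # median word and everything after it
    left = tokens[:start_index]    # words before the median word
    return right + [sep] + left


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "on", "the", "mat"]
    # Suppose the model predicts "sat" (index 2) as the median/start word.
    print(reorder_target(sentence, 2))
    # ['sat', 'on', 'the', 'mat', '<sep>', 'the', 'cat']
```

In this view, a standard autoregressive decoder can be trained on the reordered sequences, so the "non-monotonic" order reduces to a monotonic pass over a permuted target.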
