Vocabulary Selection Strategies for Neural Machine Translation

Classical translation models constrain the space of possible outputs by selecting a subset of translation rules based on the input sentence. Recent work on improving the efficiency of neural translation models has adopted a similar strategy, restricting the output vocabulary to a subset of likely candidates given the source. In this paper we experiment with context- and embedding-based selection methods and extend previous work by examining speed and accuracy trade-offs in more detail. We show that decoding time on CPUs can be reduced by up to 90% and training time by 25% on the WMT15 English-German and WMT16 English-Romanian tasks, with no or only negligible change in accuracy. This brings the decoding speed of a state-of-the-art neural translation system to just over 140 words per second on a single CPU core for English-German.
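The core idea of source-driven vocabulary selection described above can be illustrated with a small sketch. This is a hypothetical, simplified example (the lexicon, its entries, and the function names are illustrative, not the paper's implementation): each source word contributes its candidate translations, taken e.g. from a word-alignment model, and these are unioned with the most frequent target words to form the restricted vocabulary over which the softmax is computed.

```python
# Hypothetical sketch of source-driven vocabulary selection.
# The lexicon entries below are toy data; in practice the table
# would be estimated from word alignments or co-occurrence counts.

# Toy lexical translation table: source word -> candidate target words.
LEXICON = {
    "the": ["der", "die", "das"],
    "house": ["Haus", "Gebäude"],
    "is": ["ist"],
    "small": ["klein", "gering"],
}

# Always include the K most frequent target words, so that common
# function words and punctuation are never pruned away.
TOP_FREQUENT = ["der", "die", "und", ".", ","]

def select_vocabulary(source_tokens, lexicon, top_frequent):
    """Restricted target vocabulary for one source sentence:
    the union of per-source-word candidates and frequent words."""
    vocab = set(top_frequent)
    for tok in source_tokens:
        vocab.update(lexicon.get(tok, []))
    return vocab

vocab = select_vocabulary("the house is small".split(), LEXICON, TOP_FREQUENT)
# The output projection and softmax are then computed only over `vocab`
# rather than the full target vocabulary; since the softmax dominates
# decoding cost on CPUs, shrinking it is where the speed-up comes from.
```

The accuracy/speed trade-off studied in the paper corresponds to varying how many candidates per source word and how many frequent words are kept: smaller restricted vocabularies decode faster but risk pruning the correct translation.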
