[1] Marc'Aurelio Ranzato,et al. Analyzing Uncertainty in Neural Machine Translation , 2018, ICML.
[2] Tara N. Sainath,et al. Deep Neural Network Language Models , 2012, WLM@NAACL-HLT.
[3] Jason Eisner,et al. Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model , 2018, AAAI.
[4] Graham Neubig,et al. Neural Lattice Language Models , 2018, TACL.
[5] Geoffrey E. Hinton,et al. Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer , 2017, ICLR.
[6] Joshua Goodman,et al. Classes for fast maximum entropy training , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
[7] Lukás Burget,et al. Extensions of recurrent neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[8] Wenlin Chen,et al. Strategies for Training Large Vocabulary Neural Language Models , 2015, ACL.
[9] Holger Schwenk,et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation , 2012, WLM@NAACL-HLT.
[10] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[11] Ashish Vaswani,et al. Decoding with Large-Scale Neural Language Models Improves Translation , 2013, EMNLP.
[12] Richard Socher,et al. An Analysis of Neural Language Modeling at Multiple Scales , 2018, ArXiv.
[13] Noah Constant,et al. Character-Level Language Modeling with Deeper Self-Attention , 2018, AAAI.
[14] Phil Blunsom,et al. Pragmatic Neural Language Modelling in Machine Translation , 2014, NAACL.
[15] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[16] Hakan Inan,et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling , 2016, ICLR.
[17] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[18] Yann Dauphin,et al. Language Modeling with Gated Convolutional Networks , 2016, ICML.
[19] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res.
[20] Peter Dayan,et al. Fast Parametric Learning with Activation Memorization , 2018, ICML.
[21] Alexander M. Rush,et al. Character-Aware Neural Language Models , 2015, AAAI.
[22] Moustapha Cissé,et al. Efficient softmax approximation for GPUs , 2016, ICML.
[23] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] Geoffrey E. Hinton,et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[25] Myle Ott,et al. Scaling Neural Machine Translation , 2018, WMT.
[26] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[27] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Restarts , 2016, ArXiv.
[28] Joris Pelemans,et al. Sparse Non-negative Matrix Language Modeling , 2016, Transactions of the Association for Computational Linguistics.
[29] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[30] Frank Hutter,et al. SGDR: Stochastic Gradient Descent with Warm Restarts , 2016, ICLR.
[31] Lukasz Kaiser,et al. Attention Is All You Need , 2017, NIPS.
[32] Thorsten Brants,et al. One billion word benchmark for measuring progress in statistical language modeling , 2013, INTERSPEECH.
[33] Yonghui Wu,et al. Exploring the Limits of Language Modeling , 2016, ArXiv.
[34] Nicolas Usunier,et al. Improving Neural Language Models with a Continuous Cache , 2016, ICLR.