Decoding with Large-Scale Neural Language Models Improves Translation
Ashish Vaswani | Yinggong Zhao | Victoria Fossum | David Chiang
[1] R. Kronmal,et al. On the Alias Method for Generating Random Variables From a Discrete Distribution, 1979.
[2] Stanley F. Chen,et al. An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.
[3] Yoshua Bengio,et al. A Neural Probabilistic Language Model , 2003, J. Mach. Learn. Res.
[4] Franz Josef Och,et al. Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.
[5] Hermann Ney,et al. A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.
[6] H. Schwenk,et al. Efficient training of large neural networks for language modeling , 2004, 2004 IEEE International Joint Conference on Neural Networks.
[7] Jean-Luc Gauvain,et al. Training Neural Network Language Models on Very Large Corpora , 2005, HLT.
[8] Yoshua Bengio,et al. Hierarchical Probabilistic Neural Network Language Model , 2005, AISTATS.
[9] Holger Schwenk,et al. Continuous Space Language Models for Statistical Machine Translation , 2006, ACL.
[10] David Chiang,et al. Hierarchical Phrase-Based Translation , 2007, CL.
[11] Holger Schwenk,et al. Continuous space language models , 2007, Comput. Speech Lang..
[12] Geoffrey E. Hinton,et al. A Scalable Hierarchical Distributed Language Model , 2008, NIPS.
[13] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[14] Aapo Hyvärinen,et al. Noise-contrastive estimation: A new estimation principle for unnormalized statistical models , 2010, AISTATS.
[15] Alexandre Allauzen,et al. Structured Output Layer neural network language model , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[16] Lukás Burget,et al. Empirical Evaluation and Combination of Advanced Language Modeling Techniques , 2011, INTERSPEECH.
[17] Alon Lavie,et al. Language Model Rest Costs and Space-Efficient Storage , 2012, EMNLP.
[18] Yoshua Bengio,et al. Practical Recommendations for Gradient-Based Training of Deep Architectures , 2012, Neural Networks: Tricks of the Trade.
[19] Yee Whye Teh,et al. A fast and simple algorithm for training neural probabilistic language models , 2012, ICML.
[20] Jan Niehues,et al. Continuous space language models using restricted Boltzmann machines , 2012, IWSLT.
[21] Tomáš Mikolov. Statistical Language Models Based on Neural Networks, 2012.
[22] Holger Schwenk,et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation , 2012, WLM@NAACL-HLT.
[23] Geoffrey E. Hinton,et al. On rectified linear units for speech processing , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[24] Holger Schwenk,et al. CSLM - a modular open-source continuous space language modeling toolkit , 2013, INTERSPEECH.