Knowledge Distillation for Recurrent Neural Network Language Modeling with Trust Regularization
Yangyang Shi | Mei-Yuh Hwang | Xin Lei | Haoyu Sheng
[1] Jürgen Schmidhuber et al. Recurrent Highway Networks, 2016, ICML.
[2] Nicolas Usunier et al. Improving Neural Language Models with a Continuous Cache, 2016, ICLR.
[3] Wojciech Zaremba et al. Recurrent Neural Network Regularization, 2014, arXiv.
[4] Janet M. Baker et al. The Design for the Wall Street Journal-based CSR Corpus, 1992, HLT.
[5] Richard Socher et al. Regularizing and Optimizing LSTM Language Models, 2017, ICLR.
[6] Yann LeCun et al. Regularization of Neural Networks using DropConnect, 2013, ICML.
[7] Alexander M. Rush et al. Character-Aware Neural Language Models, 2015, AAAI.
[8] Daniel Povey et al. The Kaldi Speech Recognition Toolkit, 2011.
[9] Quoc V. Le et al. Neural Architecture Search with Reinforcement Learning, 2016, ICLR.
[10] Geoffrey E. Hinton et al. Distilling the Knowledge in a Neural Network, 2015, arXiv.
[11] Ruslan Salakhutdinov et al. Breaking the Softmax Bottleneck: A High-Rank RNN Language Model, 2017, ICLR.
[12] Christopher D. Manning et al. Compression of Neural Machine Translation Models via Pruning, 2016, CoNLL.
[13] Richard Socher et al. Pointer Sentinel Mixture Models, 2016, ICLR.
[14] Sachin S. Talathi et al. Fixed Point Quantization of Deep Convolutional Networks, 2015, ICML.
[15] Geoffrey Zweig et al. Context Dependent Recurrent Neural Network Language Model, 2012, IEEE Spoken Language Technology Workshop (SLT).
[16] Misha Denil et al. Predicting Parameters in Deep Learning, 2014.
[17] Jason Weston et al. A Neural Attention Model for Abstractive Sentence Summarization, 2015, EMNLP.
[18] Hakan Inan et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling, 2016, ICLR.
[19] Yoshua Bengio et al. FitNets: Hints for Thin Deep Nets, 2014, ICLR.
[20] Zoubin Ghahramani et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks, 2015, NIPS.
[21] Holger Schwenk et al. Large, Pruned or Continuous Space Language Models on a GPU for Statistical Machine Translation, 2012, WLM@NAACL-HLT.
[22] Yifan Gong et al. Restructuring of Deep Neural Network Acoustic Models with Singular Value Decomposition, 2013, INTERSPEECH.
[23] Lukáš Burget et al. Recurrent Neural Network Based Language Model, 2010, INTERSPEECH.
[24] Chris Dyer et al. On the State of the Art of Evaluation in Neural Language Models, 2017, ICLR.