[1] Alexander M. Rush, et al. Structured Attention Networks, 2017, ICLR.
[2] Quoc V. Le, et al. Sequence to Sequence Learning with Neural Networks, 2014, NIPS.
[3] Sergey Ioffe, et al. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, ICML.
[4] Samy Bengio, et al. Tensor2Tensor for Neural Machine Translation, 2018, AMTA.
[5] Sepp Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen, 1991.
[6] Xiaocheng Feng, et al. Topic-to-Essay Generation with Neural Networks, 2018, IJCAI.
[7] Geoffrey E. Hinton, et al. Generating Text with Recurrent Neural Networks, 2011, ICML.
[8] Daniel Jurafsky, et al. Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context, 2018, ACL.
[9] Jürgen Schmidhuber, et al. Long Short-Term Memory, 1997, Neural Computation.
[10] Yoshua Bengio, et al. Neural Machine Translation by Jointly Learning to Align and Translate, 2014, ICLR.
[11] Luca Antiga, et al. Automatic differentiation in PyTorch, 2017.
[12] Leslie Lamport. LaTeX: A Document Preparation System, 1985.
[13] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.
[14] Yoshua Bengio, et al. On the Properties of Neural Machine Translation: Encoder–Decoder Approaches, 2014, SSST@EMNLP.
[15] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks, 2013, arXiv.
[16] 池内 健二, et al. Document preparation system, 2006.
[17] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2016, CVPR.
[18] Yoshua Bengio, et al. Learning long-term dependencies with gradient descent is difficult, 1994, IEEE Trans. Neural Networks.
[19] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding, 2019, NAACL.
[20] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context, 2019, ACL.
[21] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners, 2019.