[1] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci.
[2] Steve Renals,et al. Multiplicative LSTM for sequence modelling , 2016, ICLR.
[3] Ying Zhang,et al. On Multiplicative Integration with Recurrent Neural Networks , 2016, NIPS.
[4] Quoc V. Le,et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[5] Richard Socher,et al. An Analysis of Neural Language Modeling at Multiple Scales , 2018, ArXiv.
[6] Yiming Yang,et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[7] Jascha Sohl-Dickstein,et al. Input Switched Affine Networks: An RNN Architecture Designed for Interpretability , 2016, ICML.
[8] Geoffrey E. Hinton,et al. Learning representations by back-propagating errors , 1986, Nature.
[9] Dhruv Batra,et al. Analyzing the Behavior of Visual Question Answering Models , 2016, EMNLP.
[10] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[11] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[12] Lior Wolf,et al. Using the Output Embedding to Improve Language Models , 2016, EACL.
[13] Paul J. Werbos. Backpropagation Through Time: What It Does and How to Do It , 1990 .
[14] Geoffrey E. Hinton,et al. Generating Text with Recurrent Neural Networks , 2011, ICML.
[15] Beatrice Santorini,et al. Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.
[16] Sebastian Ruder,et al. Universal Language Model Fine-tuning for Text Classification , 2018, ACL.
[17] Jürgen Schmidhuber,et al. LSTM can Solve Hard Long Time Lag Problems , 1996, NIPS.
[18] Chris Dyer,et al. Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling , 2017, ACL.
[19] Rico Sennrich,et al. Neural Machine Translation of Rare Words with Subword Units , 2015, ACL.
[20] Quoc V. Le,et al. Sequence to Sequence Learning with Neural Networks , 2014, NIPS.
[21] Vladlen Koltun,et al. Trellis Networks for Sequence Modeling , 2018, ICLR.
[22] Yoshua Bengio,et al. Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.
[23] Yoshua Bengio,et al. Unitary Evolution Recurrent Neural Networks , 2015, ICML.
[24] D. Sculley,et al. Google Vizier: A Service for Black-Box Optimization , 2017, KDD.
[25] Yonatan Belinkov,et al. Synthetic and Natural Noise Both Break Neural Machine Translation , 2017, ICLR.
[26] Bram Bakker,et al. Reinforcement Learning with Long Short-Term Memory , 2001, NIPS.
[27] John Hale,et al. LSTMs Can Learn Syntax-Sensitive Dependencies Well, But Modeling Structure Makes Them Better , 2018, ACL.
[28] Richard Socher,et al. Regularizing and Optimizing LSTM Language Models , 2017, ICLR.
[29] Luke S. Zettlemoyer,et al. Adversarial Example Generation with Syntactically Controlled Paraphrase Networks , 2018, NAACL.
[30] Andrew W. Senior,et al. Long Short-Term Memory Based Recurrent Neural Network Architectures for Large Vocabulary Speech Recognition , 2014, ArXiv.
[31] Richard Socher,et al. Pointer Sentinel Mixture Models , 2016, ICLR.
[32] Lukás Burget,et al. Recurrent neural network based language model , 2010, INTERSPEECH.
[33] Tomohide Shibata. Understand in 5 Minutes!? A Quick Skim of Famous Papers: Jacob Devlin et al. : BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding , 2020 .
[34] Percy Liang,et al. Adversarial Examples for Evaluating Reading Comprehension Systems , 2017, EMNLP.
[35] Steve Renals,et al. Dynamic Evaluation of Neural Sequence Models , 2017, ICML.
[36] Di He,et al. FRAGE: Frequency-Agnostic Word Representation , 2018, NeurIPS.
[37] Emmanuel Dupoux,et al. Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies , 2016, TACL.
[38] Hakan Inan,et al. Tying Word Vectors and Word Classifiers: A Loss Framework for Language Modeling , 2016, ICLR.
[39] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[40] Chris Dyer,et al. On the State of the Art of Evaluation in Neural Language Models , 2017, ICLR.
[41] Michael Strube,et al. Lexical Features in Coreference Resolution: To be Used With Caution , 2017, ACL.
[42] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[43] Ruslan Salakhutdinov,et al. Breaking the Softmax Bottleneck: A High-Rank RNN Language Model , 2017, ICLR.
[44] Chong Wang,et al. Sequence Modeling via Segmentations , 2017, ICML.
[45] Chris Dyer,et al. Pushing the bounds of dropout , 2018, ArXiv.
[46] Yoshua Bengio,et al. Gated Feedback Recurrent Neural Networks , 2015, ICML.
[47] Steve Renals,et al. Dynamic Evaluation of Transformer Language Models , 2019, ArXiv.
[48] Zoubin Ghahramani,et al. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks , 2015, NIPS.
[49] Jürgen Schmidhuber,et al. A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks , 2006 .