Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto | H. Francis Song | Jack W. Rae | Razvan Pascanu | Çaglar Gülçehre | Siddhant M. Jayakumar | Max Jaderberg | Raphael Lopez Kaufman | Aidan Clark | Seb Noury | Matthew M. Botvinick | Nicolas Heess | Raia Hadsell
[1] Wojciech Czarnecki, et al. Multi-task Deep Reinforcement Learning with PopArt , 2018, AAAI.
[2] Lukasz Kaiser, et al. Generating Wikipedia by Summarizing Long Sequences , 2018, ICLR.
[3] Tie-Yan Liu, et al. On Layer Normalization in the Transformer Architecture , 2020, ICML.
[4] Alex Graves, et al. Conditional Image Generation with PixelCNN Decoders , 2016, NIPS.
[5] Yoshua Bengio, et al. Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation , 2014, EMNLP.
[6] Klaus-Robert Müller, et al. Efficient BackProp , 2012, Neural Networks: Tricks of the Trade.
[7] Yiming Yang, et al. Transformer-XL: Attentive Language Models beyond a Fixed-Length Context , 2019, ACL.
[8] Pieter Abbeel, et al. A Simple Neural Attentive Meta-Learner , 2017, ICLR.
[9] Ilya Sutskever, et al. Language Models are Unsupervised Multitask Learners , 2019.
[10] G. Kane. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol 1: Foundations, vol 2: Psychological and Biological Models , 1994.
[11] Jürgen Schmidhuber, et al. Long Short-Term Memory , 1997, Neural Computation.
[12] Mirella Lapata, et al. Text Summarization with Pretrained Encoders , 2019, EMNLP.
[13] Yiming Yang, et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.
[14] Joel Z. Leibo, et al. Unsupervised Predictive Memory in a Goal-Directed Agent , 2018, arXiv.
[15] Ming-Wei Chang, et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[16] Demis Hassabis, et al. Neural Episodic Control , 2017, ICML.
[17] Geoffrey E. Hinton, et al. On the importance of initialization and momentum in deep learning , 2013, ICML.
[18] Ruslan Salakhutdinov, et al. Neural Map: Structured Memory for Deep Reinforcement Learning , 2017, ICLR.
[19] Marc G. Bellemare, et al. The Arcade Learning Environment: An Evaluation Platform for General Agents , 2012, J. Artif. Intell. Res.
[20] Joel Z. Leibo, et al. Model-Free Episodic Control , 2016, arXiv.
[21] Asli Celikyilmaz, et al. Working Memory Graphs , 2020, ICML.
[22] Rahul Sukthankar, et al. Cognitive Mapping and Planning for Visual Navigation , 2017, International Journal of Computer Vision.
[23] Razvan Pascanu, et al. Relational recurrent neural networks , 2018, NeurIPS.
[24] Yuval Tassa, et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[25] Rémi Munos, et al. Recurrent Experience Replay in Distributed Reinforcement Learning , 2018, ICLR.
[26] Yuval Tassa, et al. Relative Entropy Regularized Policy Iteration , 2018, arXiv.
[27] Razvan Pascanu, et al. Deep reinforcement learning with relational inductive biases , 2018, ICLR.
[28] Alex Graves, et al. Generating Sequences With Recurrent Neural Networks , 2013, arXiv.
[29] Quoc V. Le, et al. Neural Architecture Search with Reinforcement Learning , 2016, ICLR.
[30] Jian Sun, et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[31] Lukasz Kaiser, et al. Universal Transformers , 2018, ICLR.
[32] Ying Zhang, et al. On Multiplicative Integration with Recurrent Neural Networks , 2016, NIPS.
[33] Shane Legg, et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[34] Yoshua Bengio, et al. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling , 2014, arXiv.
[35] Myle Ott, et al. Understanding Back-Translation at Scale , 2018, EMNLP.
[36] Yoshua Bengio, et al. Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.
[37] Guy Lever, et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[38] Julian Salazar, et al. Transformers without Tears: Improving the Normalization of Self-Attention , 2019, arXiv.
[39] Steve Renals, et al. Multiplicative LSTM for sequence modelling , 2016, ICLR.
[40] Kenneth O. Stanley, et al. Differentiable plasticity: training plastic neural networks with backpropagation , 2018, ICML.
[41] Jian Sun, et al. Identity Mappings in Deep Residual Networks , 2016, ECCV.
[42] Alex Graves, et al. Grid Long Short-Term Memory , 2015, ICLR.
[43] Jürgen Schmidhuber, et al. Recurrent Highway Networks , 2016, ICML.
[44] Sergio Gomez Colmenarejo, et al. TF-Replicator: Distributed Machine Learning for Researchers , 2019, arXiv.
[45] Lukasz Kaiser, et al. Attention is All you Need , 2017, NIPS.
[46] Alex Graves, et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[47] Yee Whye Teh, et al. Meta reinforcement learning as task inference , 2019, arXiv.
[48] Yann Dauphin, et al. Language Modeling with Gated Convolutional Networks , 2016, ICML.
[49] Alexei Baevski, et al. Adaptive Input Representations for Neural Language Modeling , 2018, ICLR.
[50] Martin A. Riedmiller, et al. Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics , 2020, CoRL.
[51] Max Jaderberg, et al. Perception-Prediction-Reaction Agents for Deep Reinforcement Learning , 2020, arXiv.
[52] Sergio Gomez Colmenarejo, et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[53] H. Francis Song, et al. V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control , 2019, ICLR.