Dota 2 with Large Scale Deep Reinforcement Learning
Jakub W. Pachocki | Henrique Pondé de Oliveira Pinto | F. Wolski | Ilya Sutskever | Tim Salimans | Christopher Hesse | Scott Gray | R. Józefowicz | Greg Brockman | Vicki Cheung | Jonas Schneider | Jie Tang | Jonathan Raiman | Christopher Berner | Brooke Chan | Przemyslaw Debiak | Christy Dennison | David Farhi | Quirin Fischer | Shariq Hashme | Catherine Olsson | Michael Petrov | Jeremy Schlatter | Szymon Sidor | Susan Zhang
[1] O. H. Brownlee,et al. ACTIVITY ANALYSIS OF PRODUCTION AND ALLOCATION , 1952 .
[2] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[3] L. V. Allis,et al. Searching for solutions in games and artificial intelligence , 1994 .
[4] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[5] John N. Tsitsiklis,et al. Actor-Critic Algorithms , 1999, NIPS.
[6] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[7] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[8] Thomas Hofmann,et al. TrueSkill™: A Bayesian Skill Rating System , 2007 .
[9] Geoffrey J. Gordon,et al. A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning , 2010, AISTATS.
[10] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[11] Sergey Levine,et al. Guided Policy Search , 2013, ICML.
[12] Geoffrey E. Hinton,et al. Distilling the Knowledge in a Neural Network , 2015, ArXiv.
[13] Aditya Jain,et al. A comparative study of visual and auditory reaction times on the basis of gender and physical activity levels of medical first year students , 2015, International journal of applied & basic medical research.
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] Tianqi Chen,et al. Net2Net: Accelerating Learning via Knowledge Transfer , 2015, ICLR.
[16] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[17] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[18] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[19] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[20] Joshua B. Tenenbaum,et al. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation , 2016, NIPS.
[21] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[22] Kaiming He,et al. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour , 2017, ArXiv.
[23] Martial Hebert,et al. Growing a Brain: Fine-Tuning by Increasing Model Capacity , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
[24] David Barber,et al. Thinking Fast and Slow with Deep Learning and Tree Search , 2017, NIPS.
[25] Diederik P. Kingma,et al. GPU Kernels for Block-Sparse Weights , 2017 .
[26] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[27] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[28] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[29] Yang You,et al. Scaling SGD Batch Size to 32K for ImageNet Training , 2017, ArXiv.
[30] Richard Socher,et al. A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.
[31] David Budden,et al. Distributed Prioritized Experience Replay , 2018, ICLR.
[32] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[33] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[34] Derek Hoiem,et al. Learning without Forgetting , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[35] Joel Z. Leibo,et al. Human-level performance in first-person multiplayer games with population-based deep reinforcement learning , 2018, ArXiv.
[36] Yee Whye Teh,et al. Mix&Match - Agent Curricula for Reinforcement Learning , 2018, ICML.
[37] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[38] Dario Amodei,et al. An Empirical Model of Large-Batch Training , 2018, ArXiv.
[39] Demis Hassabis,et al. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.
[40] Max Jaderberg,et al. Open-ended Learning in Symmetric Zero-sum Games , 2019, ICML.
[41] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[42] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[43] Amos J. Storkey,et al. Exploration by Random Network Distillation , 2018, ICLR.
[44] Katja Hofmann,et al. The MineRL Competition on Sample Efficient Reinforcement Learning using Human Priors , 2019, ArXiv.
[45] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[46] Taehoon Kim,et al. Quantifying Generalization in Reinforcement Learning , 2018, ICML.