暂无分享,去创建一个
Guy Lever | Siqi Liu | Thore Graepel | Nicolas Heess | Saran Tunyasuvunakool | Josh Merel | N. Heess | J. Merel | S. Tunyasuvunakool | Siqi Liu | Siqi Liu | Guy Lever | T. Graepel
[1] L. Shapley,et al. Stochastic Games* , 1953, Proceedings of the National Academy of Sciences.
[2] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[3] A. Elo. The rating of chessplayers, past and present , 1978 .
[4] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[5] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[6] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[7] Hiroaki Kitano,et al. RoboCup: The Robot World Cup Initiative , 1997, AGENTS '97.
[8] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[9] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[10] Andrew Y. Ng,et al. On Local Rewards and Scaling Distributed Reinforcement Learning , 2005, NIPS.
[11] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[12] Martin A. Riedmiller,et al. Reinforcement learning for robot soccer , 2009, Auton. Robots.
[13] Richard L. Lewis,et al. Optimal rewards in multiagent teams , 2012, 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL).
[14] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[15] Guy Lever,et al. Deterministic Policy Gradient Algorithms , 2014, ICML.
[16] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[17] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[18] David Silver,et al. Memory-based control with recurrent neural networks , 2015, ArXiv.
[19] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[20] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[21] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[22] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[23] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[24] Matthew J. Hausknecht,et al. Cooperation and communication in multiagent deep reinforcement learning , 2016 .
[25] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[26] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[27] Sepp Hochreiter,et al. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.
[28] Wojciech Zaremba,et al. OpenAI Gym , 2016, ArXiv.
[29] Kevin Waugh,et al. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker , 2017, ArXiv.
[30] Shimon Whiteson,et al. Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning , 2017, ICML.
[31] Tom Schaul,et al. StarCraft II: A New Challenge for Reinforcement Learning , 2017, ArXiv.
[32] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[33] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[34] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[35] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[36] Yuval Tassa,et al. Data-efficient Deep Reinforcement Learning for Dexterous Manipulation , 2017, ArXiv.
[37] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[38] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[39] Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.
[40] Joel Z. Leibo,et al. A Generalised Method for Empirical Game Theoretic Analysis , 2018, AAMAS.
[41] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[42] Joel Z. Leibo,et al. Human-level performance in first-person multiplayer games with population-based deep reinforcement learning , 2018, ArXiv.
[43] Patrick MacAlpine,et al. Overlapping layered learning , 2018, Artif. Intell..
[44] Yuval Tassa,et al. DeepMind Control Suite , 2018, ArXiv.
[45] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[46] Martin A. Riedmiller,et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch , 2018, ICML.
[47] Thore Graepel,et al. Re-evaluating evaluation , 2018, NeurIPS.