暂无分享,去创建一个
Jakub W. Pachocki | Igor Mordatch | Ilya Sutskever | Szymon Sidor | Trapit Bansal | Ilya Sutskever | Igor Mordatch | Szymon Sidor | Trapit Bansal | J. Pachocki | I. Sutskever
[1] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[2] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[3] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[4] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[5] Michael L. Littman,et al. Markov Games as a Framework for Multi-Agent Reinforcement Learning , 1994, ICML.
[6] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[7] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[8] Risto Miikkulainen,et al. Competitive Coevolution through Evolutionary Complexification , 2011, J. Artif. Intell. Res..
[9] Jordan L. Boyd-Graber,et al. Opponent Modeling in Deep Reinforcement Learning , 2016, ICML.
[10] Ilya Kostrikov,et al. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play , 2017, ICLR.
[11] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[12] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[13] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[14] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.
[15] Sergey Levine,et al. Trust Region Policy Optimization , 2015, ICML.
[16] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[17] Evan Herbst,et al. Character animation in two-player adversarial games , 2010, TOGS.
[18] Dilin Wang,et al. Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm , 2016, NIPS.
[19] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[20] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[21] Dorian Kodelja,et al. Multiagent cooperation and competition with deep reinforcement learning , 2015, PloS one.
[22] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[23] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[24] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[25] Pieter Abbeel,et al. Benchmarking Deep Reinforcement Learning for Continuous Control , 2016, ICML.
[26] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[27] James Davidson,et al. Supervision via competition: Robot adversaries for learning tasks , 2016, 2017 IEEE International Conference on Robotics and Automation (ICRA).
[28] Karl Sims,et al. Evolving virtual creatures , 1994, SIGGRAPH.
[29] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.
[30] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[31] Rich Caruana,et al. Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.
[32] Marcin Andrychowicz,et al. Hindsight Experience Replay , 2017, NIPS.