Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
Joel Z. Leibo | Ari S. Morcos | Neil C. Rabinowitz | Antonio García Castañeda | Wojciech M. Czarnecki | Max Jaderberg | K. Kavukcuoglu | D. Hassabis | Charlie Beattie | Avraham Ruderman | Guy Lever | T. Graepel | Luke Marris | Tim Green | Iain Dunning | Nicolas Sonnerat | Louise Deason | David Silver
[1] Jeff Orkin,et al. Three States and a Plan: The A.I. of F.E.A.R., 2006.
[2] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[3] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[4] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[5] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[6] Yoshua Bengio,et al. A Recurrent Latent Variable Model for Sequential Data , 2015, NIPS.
[7] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[8] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[9] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE, 2008.
[10] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[11] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[12] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[13] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[14] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[15] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[16] Jürgen Schmidhuber,et al. A Clockwork RNN , 2014, ICML.
[17] Martin A. Riedmiller,et al. On Experiences in a Complex and Competitive Gaming Domain: Reinforcement Learning Meets RoboCup , 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[18] E. Hellinger,et al. Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen, 1909.
[19] Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.
[20] R. Quiroga. Concept cells: the building blocks of declarative memory functions , 2012, Nature Reviews Neuroscience.
[21] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[22] David H. Ackley,et al. Interactions between learning and evolution, 1991.
[23] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[24] Jeremy R. Cooperstock,et al. On the Limits of the Human Motor Control Precision: The Search for a Device's Human Resolution , 2011, INTERACT.
[25] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[26] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[27] Ryan P. Adams,et al. Mapping Sub-Second Structure in Mouse Behavior , 2015, Neuron.
[28] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[29] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[30] Frans Mäyrä,et al. Fundamental Components of the Gameplay Experience: Analysing Immersion , 2005, DiGRA Conference.
[31] Julian Togelius,et al. Hierarchical controller learning in a First-Person Shooter , 2009, 2009 IEEE Symposium on Computational Intelligence and Games.
[32] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[33] Richard L. Lewis,et al. Where Do Rewards Come From, 2009.
[34] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[35] A. Elo. The rating of chessplayers, past and present, 1978.
[36] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[37] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[38] David Silver,et al. Reinforced Variational Inference , 2015, NIPS 2015.
[39] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[40] Kagan Tumer,et al. An Introduction to Collective Intelligence , 1999, ArXiv.
[41] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[42] Yoshua Bengio,et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies , 1995, NIPS.
[43] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[44] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[45] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[46] M. A. MacIver,et al. Neuroscience Needs Behavior: Correcting a Reductionist Bias , 2017, Neuron.
[47] Sergey Levine,et al. Variational Policy Search via Trajectory Optimization , 2013, NIPS.
[48] J. Pratt,et al. The effects of action video game experience on the time course of inhibition of return and the efficiency of visual search, 2005, Acta psychologica.
[49] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[50] Manuela M. Veloso,et al. Layered Learning , 2000, ECML.
[51] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[52] Samy Bengio,et al. Generating Sentences from a Continuous Space , 2015, CoNLL.
[53] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[54] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[55] Hiroaki Kitano,et al. RoboCup: A Challenge Problem for AI and Robotics , 1997, RoboCup.
[56] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.
[57] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[58] Richard K. Belew,et al. New Methods for Competitive Coevolution , 1997, Evolutionary Computation.
[59] Patrick MacAlpine,et al. UT Austin Villa: RoboCup 2016 3D Simulation League Competition and Technical Challenges Champions , 2015, Robot Soccer World Cup.
[60] John E. Laird,et al. Human-Level AI's Killer Application: Interactive Computer Games , 2000, AI Mag..
[61] Sarit Kraus,et al. Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination , 2010, AAAI.
[62] Ole Winther,et al. Sequential Neural Models with Stochastic Layers , 2016, NIPS.
[63] C. Honey,et al. Processing Timescales as an Organizing Principle for Primate Cortex , 2015, Neuron.