Human-level performance in 3D multiplayer games with population-based reinforcement learning
Max Jaderberg | Wojciech M. Czarnecki | Iain Dunning | Luke Marris | Guy Lever | Antonio García Castañeda | Charles Beattie | Neil C. Rabinowitz | Ari S. Morcos | Avraham Ruderman | Nicolas Sonnerat | Tim Green | Louise Deason | Joel Z. Leibo | David Silver | Demis Hassabis | Koray Kavukcuoglu | Thore Graepel
[1] E. Hellinger,et al. Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen. , 1909 .
[2] P. N. Rasmussen. Tjalling C. Koopmans (edt.), Activity Analysis of Production and Allocation. Cowles Commission for Research in Economics, Monograph No. 13. John Wiley & Sons, New York, and Chapman & Hall, London, 1951. 404 sider. $ 4,50. , 1952 .
[3] A. Elo. The rating of chessplayers, past and present , 1978 .
[4] David H. Ackley,et al. Interactions between learning and evolution , 1991 .
[5] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[6] P. Greenfield,et al. Action video games and informal education: Effects on strategies for dividing visual attention , 1994 .
[7] Gerald Tesauro,et al. Temporal difference learning and TD-Gammon , 1995, CACM.
[8] Yoshua Bengio,et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies , 1995, NIPS.
[9] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[10] Hiroaki Kitano,et al. RoboCup: A Challenge Problem for AI and Robotics , 1997, RoboCup.
[11] Richard K. Belew,et al. New Methods for Competitive Coevolution , 1997, Evolutionary Computation.
[12] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[13] Andrew Y. Ng,et al. Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping , 1999, ICML.
[14] Kagan Tumer,et al. An Introduction to Collective Intelligence , 1999, ArXiv.
[15] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[16] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[17] Manuela M. Veloso,et al. Layered Learning , 2000, ECML.
[18] John E. Laird,et al. Human-Level AI's Killer Application: Interactive Computer Games , 2000, AI Mag..
[19] Frans Mäyrä,et al. Fundamental Components of the Gameplay Experience: Analysing Immersion , 2005, DiGRA Conference.
[20] J. Pratt,et al. The effects of action video game experience on the time course of inhibition of return and the efficiency of visual search. , 2005, Acta psychologica.
[21] Jeff Orkin,et al. Three States and a Plan: The A.I. of F.E.A.R. , 2006 .
[22] Peter Stone,et al. Know Thine Enemy: A Champion RoboCup Coach Agent , 2006, AAAI.
[23] Martin A. Riedmiller,et al. On Experiences in a Complex and Competitive Gaming Domain: Reinforcement Learning Meets RoboCup , 2007, 2007 IEEE Symposium on Computational Intelligence and Games.
[24] Geoffrey E. Hinton,et al. Visualizing Data using t-SNE , 2008 .
[25] C Shawn Green,et al. Increasing Speed of Processing With Action Video Games , 2009, Current directions in psychological science.
[26] Julian Togelius,et al. Hierarchical controller learning in a First-Person Shooter , 2009, 2009 IEEE Symposium on Computational Intelligence and Games.
[27] Richard L. Lewis, et al. Where Do Rewards Come From? , 2009 .
[28] Marc Toussaint,et al. Learning model-free robot control by a Monte Carlo EM algorithm , 2009, Auton. Robots.
[29] Richard L. Lewis,et al. Intrinsically Motivated Reinforcement Learning: An Evolutionary Perspective , 2010, IEEE Transactions on Autonomous Mental Development.
[30] Sarit Kraus,et al. Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination , 2010, AAAI.
[31] Jeremy R. Cooperstock,et al. On the Limits of the Human Motor Control Precision: The Search for a Device's Human Resolution , 2011, INTERACT.
[32] R. Quiroga. Concept cells: the building blocks of declarative memory functions , 2012, Nature Reviews Neuroscience.
[33] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[34] Sergey Levine,et al. Variational Policy Search via Trajectory Optimization , 2013, NIPS.
[35] Daan Wierstra,et al. Stochastic Backpropagation and Approximate Inference in Deep Generative Models , 2014, ICML.
[36] Jürgen Schmidhuber,et al. A Clockwork RNN , 2014, ICML.
[37] Max Welling,et al. Auto-Encoding Variational Bayes , 2013, ICLR.
[38] Juan Carlos Fernández,et al. Multiobjective evolutionary algorithms to identify highly autocorrelated areas: the case of spatial distribution in financially compromised farms , 2014, Ann. Oper. Res..
[39] Andrew Zisserman,et al. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps , 2013, ICLR.
[40] Yoshua Bengio,et al. A Recurrent Latent Variable Model for Sequential Data , 2015, NIPS.
[41] Ryan P. Adams,et al. Mapping Sub-Second Structure in Mouse Behavior , 2015, Neuron.
[42] David Silver, et al. Reinforced Variational Inference , 2015, NIPS Workshops.
[43] C. S. Green,et al. Action video game training for cognitive enhancement , 2015, Current Opinion in Behavioral Sciences.
[44] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[45] C. Honey,et al. Processing Timescales as an Organizing Principle for Primate Cortex , 2015, Neuron.
[46] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[47] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[48] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[49] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[50] Sergio Gomez Colmenarejo,et al. Hybrid computing using a neural network with dynamic external memory , 2016, Nature.
[51] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[52] David Silver,et al. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games , 2016, ArXiv.
[53] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[54] Samy Bengio,et al. Generating Sentences from a Continuous Space , 2015, CoNLL.
[55] Ole Winther,et al. Sequential Neural Models with Stochastic Layers , 2016, NIPS.
[56] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[57] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[59] Yoshua Bengio,et al. Hierarchical Multiscale Recurrent Neural Networks , 2016, ICLR.
[60] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[61] Kevin Waugh,et al. DeepStack: Expert-level artificial intelligence in heads-up no-limit poker , 2017, Science.
[62] Yuandong Tian,et al. Training Agent for First-Person Shooter Game with Actor-Critic Curriculum Learning , 2016, ICLR.
[63] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[64] David Silver,et al. A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning , 2017, NIPS.
[65] Joël Billieux,et al. Shoot at first sight! First person shooter players display reduced reaction time and compromised inhibitory control in comparison to other video game players , 2017, Comput. Hum. Behav..
[66] Demis Hassabis,et al. Mastering the game of Go without human knowledge , 2017, Nature.
[67] M. A. MacIver,et al. Neuroscience Needs Behavior: Correcting a Reductionist Bias , 2017, Neuron.
[68] Joel Z. Leibo,et al. Multi-agent Reinforcement Learning in Sequential Social Dilemmas , 2017, AAMAS.
[69] Tom Schaul,et al. Reinforcement Learning with Unsupervised Auxiliary Tasks , 2016, ICLR.
[70] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[71] Guillaume Lample,et al. Playing FPS Games with Deep Reinforcement Learning , 2016, AAAI.
[72] Patrick MacAlpine,et al. UT Austin Villa: RoboCup 2016 3D Simulation League Competition and Technical Challenges Champions , 2015, Robot Soccer World Cup.
[73] Shimon Whiteson,et al. Learning with Opponent-Learning Awareness , 2017, AAMAS.
[74] Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.
[75] Shane Legg,et al. IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.
[76] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.