Yuval Tassa | Guy Lever | Thore Graepel | Karl Tuyls | Abbas Abdolmaleki | Nicolas Heess | Markus Wulfmeier | Leonard Hasenclever | Daniel Hennes | Brendan D. Tracey | S. M. Ali Eslami | Shayegan Omidshafiei | Saran Tunyasuvunakool | Josh Merel | Tuomas Haarnoja | Paul Muller | Noah Y. Siegel | Luke Marris | Siqi Liu | Zhe Wang | Wojciech M. Czarnecki | H. Francis Song
[1] K. Lashley. The problem of serial order in behavior , 1951 .
[2] L. Shapley,et al. Stochastic Games* , 1953, Proceedings of the National Academy of Sciences.
[3] Arthur L. Samuel,et al. Some Studies in Machine Learning Using the Game of Checkers , 1967, IBM J. Res. Dev..
[4] Roger C. Schank,et al. Scripts, plans, goals and understanding: an inquiry into human knowledge structures , 1978 .
[5] A. Elo. The rating of chessplayers, past and present , 1978 .
[6] J. Krebs,et al. Arms races between and within species , 1979, Proceedings of the Royal Society of London. Series B. Biological Sciences.
[7] Rodney A. Brooks,et al. A Robust Layered Control System For A Mobile Robot , 1986 .
[8] Marc H. Raibert,et al. Legged Robots That Balance , 1986, IEEE Expert.
[9] David A. Rosenbaum,et al. Hierarchical organization of motor programs , 1987 .
[10] Micha Sharir,et al. Algorithmic motion planning in robotics , 1991, Computer.
[11] Richard Granger. Unified Theories of Cognition , 1991, Journal of Cognitive Neuroscience.
[12] Jürgen Schmidhuber. Neural Sequence Chunkers , 1991 .
[13] Subbarao Kambhampati,et al. Combining Specialized Reasoners and General Purpose Planners: A Case Study , 1991, AAAI.
[14] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[15] Robert J. Crutcher,et al. The role of deliberate practice in the acquisition of expert performance , 1993 .
[16] James S. Albus,et al. A reference model architecture for intelligent systems design , 1993 .
[17] Karl Sims,et al. Evolving virtual creatures , 1994, SIGGRAPH.
[18] Gerald Tesauro,et al. Temporal Difference Learning and TD-Gammon , 1995, J. Int. Comput. Games Assoc..
[19] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[20] Stuart J. Russell,et al. Reinforcement Learning with Hierarchies of Machines , 1997, NIPS.
[21] Hiroaki Kitano,et al. RoboCup: The Robot World Cup Initiative , 1997, AGENTS '97.
[22] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[23] Craig Boutilier,et al. The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems , 1998, AAAI/IAAI.
[24] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[25] Manuela M. Veloso,et al. Team-partitioned, opaque-transition reinforcement learning , 1999, AGENTS '99.
[26] Neil Immerman,et al. The Complexity of Decentralized Control of Markov Decision Processes , 2000, UAI.
[27] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[28] Martin Lauer,et al. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems , 2000, ICML.
[29] Peter Stone,et al. Layered learning in multiagent systems - a winning approach to robotic soccer , 2000, Intelligent robotics and autonomous agents.
[30] Martin A. Riedmiller,et al. Karlsruhe Brainstormers - A Reinforcement Learning Approach to Robotic Soccer , 2000, RoboCup.
[31] J. Fuster. The Prefrontal Cortex—An Update: Time Is of the Essence , 2001, Neuron.
[32] K. Tuyls,et al. Reinforcement Learning in Large State Spaces , 2002, RoboCup.
[33] René Boel,et al. Discrete event dynamic systems: Theory and applications , 2002 .
[34] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[35] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[36] Peter Stone,et al. Machine Learning for Fast Quadrupedal Locomotion , 2004, AAAI.
[37] Sean Luke,et al. Cooperative Multi-Agent Learning: The State of the Art , 2005, Autonomous Agents and Multi-Agent Systems.
[38] Peter Stone,et al. Reinforcement Learning for RoboCup Soccer Keepaway , 2005, Adapt. Behav..
[39] C. Koch,et al. Invariant visual representation by single neurons in the human brain , 2005, Nature.
[40] Peter Stone,et al. The Chin Pinch: A Case Study in Skill Learning on a Legged Robot , 2006, RoboCup.
[41] S. Bennett,et al. Observational Modeling Effects for Movement Dynamics and Movement Outcome Measures Across Differing Task Constraints: A Meta-Analysis , 2006, Journal of motor behavior.
[42] H. Bekkering,et al. Joint action: bodies and minds moving together , 2006, Trends in Cognitive Sciences.
[43] Peter Stone,et al. Half Field Offense in RoboCup Soccer: A Multiagent Reinforcement Learning Case Study , 2006, RoboCup.
[44] Peter Stone,et al. Autonomous Learning of Stable Quadruped Locomotion , 2006, RoboCup.
[45] KangKang Yin,et al. SIMBICON: simple biped locomotion control , 2007, ACM Trans. Graph..
[46] Peter Stone,et al. Model-Based Reinforcement Learning in a Complex Domain , 2008, RoboCup.
[47] C. Koch,et al. Sparse but not ‘Grandmother-cell’ coding in the medial temporal lobe , 2008, Trends in Cognitive Sciences.
[48] Bart De Schutter,et al. A Comprehensive Survey of Multiagent Reinforcement Learning , 2008, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[49] Martin Lauer,et al. Learning to dribble on a real robot by success and failure , 2008, 2008 IEEE International Conference on Robotics and Automation.
[50] Peter Stone,et al. Learning Complementary Multiagent Behaviors: A Case Study , 2009, RoboCup.
[51] Martin A. Riedmiller,et al. Reinforcement learning for robot soccer , 2009, Auton. Robots.
[52] M. van de Panne,et al. Generalized biped walking control , 2010, ACM Trans. Graph..
[53] Martin A. Riedmiller,et al. On Progress in RoboCup: The Simulation League Showcase , 2010, RoboCup.
[54] Martin Lauer,et al. Cognitive concepts in autonomous soccer playing robots , 2010, Cognitive Systems Research.
[55] Peter Stone,et al. Learning Powerful Kicks on the Aibo ERS-7: The Quest for a Striker , 2010, RoboCup.
[56] N. Le Fort-Piat,et al. The world of independent learners is not Markovian , 2011, Int. J. Knowl. Based Intell. Eng. Syst..
[57] Daniel Urieli,et al. On optimizing interdependent skills: a case study in simulated 3D humanoid robot soccer , 2011, AAMAS.
[58] Gerhard Weiss,et al. Multiagent Learning: Basics, Challenges, and Prospects , 2012, AI Mag..
[59] Eli M. Swanson,et al. Evolution of Cooperation among Mammalian Carnivores and Its Relevance to Hominin Evolution , 2012, Current Anthropology.
[60] Yuval Tassa,et al. Synthesis and stabilization of complex behaviors through online trajectory optimization , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[61] Jan Peters,et al. Hierarchical Relative Entropy Policy Search , 2014, AISTATS.
[62] Guillaume J. Laurent,et al. Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems , 2012, The Knowledge Engineering Review.
[63] Yuval Tassa,et al. MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.
[64] A. Williams,et al. Developmental activities and the acquisition of superior anticipation and decision making in soccer players , 2012, Journal of sports sciences.
[65] Jan Peters,et al. Probabilistic Movement Primitives , 2013, NIPS.
[66] Javier R. Movellan,et al. STAC: Simultaneous tracking and calibration , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).
[67] Patrick MacAlpine,et al. Humanoid robots learning to walk faster: from the real world to simulation and back , 2013, AAMAS.
[68] J. Baker,et al. 20 years later: deliberate practice and the development of expertise in sport , 2014 .
[69] Yuval Tassa,et al. Learning Continuous Control Policies by Stochastic Value Gradients , 2015, NIPS.
[70] Paul Sajda,et al. Knowing when not to swing: EEG evidence that enhanced perception–action coupling underlies baseball batter expertise , 2015, NeuroImage.
[71] Eder Gonçalves,et al. Anticipation in Soccer: A Systematic Review , 2015 .
[72] J. Diedrichsen,et al. Motor skill learning between selection and execution , 2015, Trends in Cognitive Sciences.
[73] Zoran Popovic,et al. Interactive Control of Diverse Complex Characters with Neural Networks , 2015, NIPS.
[74] Shane Legg,et al. Human-level control through deep reinforcement learning , 2015, Nature.
[75] Yuval Tassa,et al. Continuous control with deep reinforcement learning , 2015, ICLR.
[76] Shimon Whiteson,et al. Learning to Communicate with Deep Multi-Agent Reinforcement Learning , 2016, NIPS.
[77] Rob Fergus,et al. Learning Multiagent Communication with Backpropagation , 2016, NIPS.
[78] Alex Graves,et al. Asynchronous Methods for Deep Reinforcement Learning , 2016, ICML.
[79] Demis Hassabis,et al. Mastering the game of Go with deep neural networks and tree search , 2016, Nature.
[80] Peter Stone,et al. Deep Reinforcement Learning in Parameterized Action Space , 2015, ICLR.
[81] Sergey Levine,et al. High-Dimensional Continuous Control Using Generalized Advantage Estimation , 2015, ICLR.
[82] Marc G. Bellemare,et al. Safe and Efficient Off-Policy Reinforcement Learning , 2016, NIPS.
[83] Yuval Tassa,et al. Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.
[84] Scott Kuindersma,et al. Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot , 2015, Autonomous Robots.
[85] Tom Schaul,et al. FeUdal Networks for Hierarchical Reinforcement Learning , 2017, ICML.
[86] Kevin Waugh,et al. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker , 2017, ArXiv.
[87] Ion Stoica,et al. Multi-Level Discovery of Deep Options , 2017, ArXiv.
[88] Doina Precup,et al. The Option-Critic Architecture , 2016, AAAI.
[89] Yi Wu,et al. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments , 2017, NIPS.
[90] Glen Berseth,et al. DeepLoco: dynamic locomotion skills using hierarchical deep reinforcement learning , 2017, ACM Trans. Graph..
[91] Yuval Tassa,et al. Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.
[92] Max Jaderberg,et al. Population Based Training of Neural Networks , 2017, ArXiv.
[93] Daan Wierstra,et al. Variational Intrinsic Control , 2016, ICLR.
[94] Yee Whye Teh,et al. Distral: Robust multitask reinforcement learning , 2017, NIPS.
[95] Alec Radford,et al. Proximal Policy Optimization Algorithms , 2017, ArXiv.
[96] Yuval Tassa,et al. Learning human behaviors from motion capture by adversarial imitation , 2017, ArXiv.
[97] Ion Stoica,et al. DDCO: Discovery of Deep Continuous Options for Robot Learning from Demonstrations , 2017, CoRL.
[98] Libin Liu,et al. Learning basketball dribbling skills using trajectory optimization and deep reinforcement learning , 2018, ACM Trans. Graph..
[99] Sergey Levine,et al. DeepMimic: Example-Guided Deep Reinforcement Learning of Physics-Based Character Skills , 2018, ACM Trans. Graph..
[100] Shimon Whiteson,et al. Counterfactual Multi-Agent Policy Gradients , 2017, AAAI.
[101] Sergey Levine,et al. Data-Efficient Hierarchical Reinforcement Learning , 2018, NeurIPS.
[102] Pieter Abbeel,et al. Meta Learning Shared Hierarchies , 2017, ICLR.
[103] Karol Hausman,et al. Learning an Embedding Space for Transferable Robot Skills , 2018, ICLR.
[104] Pieter Abbeel,et al. Emergence of Grounded Compositional Language in Multi-Agent Populations , 2017, AAAI.
[105] Pieter Abbeel,et al. Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments , 2017, ICLR.
[106] Patrick MacAlpine,et al. Overlapping layered learning , 2018, Artif. Intell..
[107] Yuval Tassa,et al. Maximum a Posteriori Policy Optimisation , 2018, ICLR.
[108] Jakub W. Pachocki,et al. Emergent Complexity via Multi-Agent Competition , 2017, ICLR.
[109] Martin A. Riedmiller,et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch , 2018, ICML.
[110] Sergey Levine,et al. Latent Space Policies for Hierarchical Reinforcement Learning , 2018, ICML.
[111] Thore Graepel,et al. Re-evaluating evaluation , 2018, NeurIPS.
[112] N. Heess,et al. Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks , 2019 .
[113] Joel Z. Leibo,et al. Autocurricula and the Emergence of Innovation from Social Interaction: A Manifesto for Multi-Agent Intelligence Research , 2019, ArXiv.
[114] J. Forbes,et al. DReCon: data-driven responsive control of physics-based characters , 2019, ACM Trans. Graph..
[115] Max Jaderberg,et al. Open-ended Learning in Symmetric Zero-sum Games , 2019, ICML.
[116] Marcin Andrychowicz,et al. Solving Rubik's Cube with a Robot Hand , 2019, ArXiv.
[117] Wojciech M. Czarnecki,et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning , 2019, Nature.
[118] Matthew E. Taylor,et al. A survey and critique of multiagent deep reinforcement learning , 2019, Autonomous Agents and Multi-Agent Systems.
[119] Yee Whye Teh,et al. Information asymmetry in KL-regularized RL , 2019, ICLR.
[120] Kyoungmin Lee,et al. Scalable muscle-actuated human simulation and control , 2019, ACM Trans. Graph..
[121] Sergey Levine,et al. Diversity is All You Need: Learning Skills without a Reward Function , 2018, ICLR.
[122] Patrick MacAlpine,et al. UT Austin Villa: RoboCup 2019 3D Simulation League Competition and Technical Challenge Champions , 2019, RoboCup.
[123] Thomas Röfer,et al. B-Human 2019 - Complex Team Play Under Natural Lighting Conditions , 2019, RoboCup.
[124] Sergey Levine,et al. MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies , 2019, NeurIPS.
[125] Guy Lever,et al. Emergent Coordination Through Competition , 2019, ICLR.
[126] Luís Paulo Reis,et al. Learning to Run Faster in a Humanoid Robot Soccer Environment Through Reinforcement Learning , 2019, RoboCup.
[127] Sergey Levine,et al. Relay Policy Learning: Solving Long-Horizon Tasks via Imitation and Reinforcement Learning , 2019, CoRL.
[128] Yee Whye Teh,et al. Neural probabilistic motor primitives for humanoid control , 2018, ICLR.
[129] Nicolas Heess,et al. Hierarchical visuomotor control of humanoids , 2018, ICLR.
[130] Greg Wayne,et al. Hierarchical motor control in mammals and machines , 2019, Nature Communications.
[131] Joonho Lee,et al. Learning agile and dynamic motor skills for legged robots , 2019, Science Robotics.
[132] Sergey Levine,et al. Near-Optimal Representation Learning for Hierarchical Reinforcement Learning , 2018, ICLR.
[133] Jonathan W. Hurst,et al. Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie , 2019, ArXiv.
[134] Guy Lever,et al. Human-level performance in 3D multiplayer games with population-based reinforcement learning , 2018, Science.
[135] Sunmin Lee,et al. Learning predict-and-simulate policies from unorganized human motion data , 2019, ACM Trans. Graph..
[136] S. Levine,et al. Learning Agile Robotic Locomotion Skills by Imitating Animals , 2020, Robotics: Science and Systems.
[137] Max Jaderberg,et al. Real World Games Look Like Spinning Tops , 2020, NeurIPS.
[138] Igor Mordatch,et al. Emergent Tool Use From Multi-Agent Autocurricula , 2019, ICLR.
[139] Yuval Tassa,et al. dm_control: Software and Tasks for Continuous Control , 2020, Softw. Impacts.
[140] Martin A. Riedmiller,et al. Compositional Transfer in Hierarchical Reinforcement Learning , 2019, Robotics: Science and Systems.
[141] Garrett Warnell,et al. Reinforced Grounded Action Transformation for Sim-to-Real Transfer , 2020, ArXiv.
[142] Peter Stone,et al. Stochastic Grounded Action Transformation for Robot Learning in Simulation , 2017, 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).
[143] O. Bousquet,et al. Google Research Football: A Novel Reinforcement Learning Environment , 2019, AAAI.
[144] Raia Hadsell,et al. CoMic: Complementary Task Learning & Mimicry for Reusable Skills , 2020, ICML.
[145] Yaodong Yang,et al. An Overview of Multi-Agent Reinforcement Learning from Game Theoretical Perspective , 2020, ArXiv.
[146] Yee Whye Teh,et al. Behavior Priors for Efficient Reinforcement Learning , 2020, J. Mach. Learn. Res..
[147] Lorenz Wellhausen,et al. Learning quadrupedal locomotion over challenging terrain , 2020, Science Robotics.
[148] Michal Valko,et al. Game Plan: What AI can do for Football, and What Football can do for AI , 2020, J. Artif. Intell. Res..
[149] Martin A. Riedmiller,et al. Data-efficient Hindsight Off-policy Option Learning , 2020, ICML.
[150] Weifeng Chen,et al. Learning to Sit: Synthesizing Human-Chair Interactions via Hierarchical Control , 2019, AAAI.