dm_control: Software and Tasks for Continuous Control
Yuval Tassa | Saran Tunyasuvunakool | Alistair Muldal | Yotam Doron | Siqi Liu | Steven Bohez | Josh Merel | Tom Erez | Timothy Lillicrap | Nicolas Heess
[1] Wojciech Zaremba et al. OpenAI Gym, 2016, ArXiv.
[2] K. Doya et al. A unifying computational framework for motor control and social interaction, 2003, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences.
[3] Silvio Savarese et al. SURREAL: Open-Source Reinforcement Learning Framework and Robot Manipulation Benchmark, 2018, CoRL.
[4] Dimitri P. Bertsekas et al. Dynamic Programming and Optimal Control, Two Volume Set, 1995.
[5] Philip Bachman et al. Deep Reinforcement Learning that Matters, 2017, AAAI.
[6] Alexandre Campeau-Lecours et al. Kinova Modular Robot Arms for Service Robotics Applications, 2017, Int. J. Robotics Appl. Technol.
[7] Andrew J. Davison et al. RLBench: The Robot Learning Benchmark & Learning Environment, 2019, IEEE Robotics and Automation Letters.
[8] Shane Legg et al. Human-level control through deep reinforcement learning, 2015, Nature.
[9] H. Francis Song et al. A Distributional View on Multi-Objective Policy Optimization, 2020, ICML.
[10] Mark W. Spong et al. The swing up control problem for the Acrobot, 1995.
[11] Yuval Tassa et al. Emergence of Locomotion Behaviours in Rich Environments, 2017, ArXiv.
[12] Pieter Abbeel et al. Benchmarking Deep Reinforcement Learning for Continuous Control, 2016, ICML.
[13] Richard S. Sutton et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[14] Tom Eccles et al. Reinforcement Learning Agents Acquire Flocking and Symbiotic Behaviour in Simulated Ecosystems, 2019, Artificial Life Conference Proceedings.
[15] Yuval Tassa et al. Deep neuroethology of a virtual rodent, 2019, ICLR.
[16] Yuval Tassa et al. Stochastic Complementarity for Local Control of Discontinuous Dynamics, 2010, Robotics: Science and Systems.
[17] Richard S. Sutton et al. Reinforcement Learning: An Introduction, 1998, IEEE Transactions on Neural Networks.
[18] Yuval Tassa et al. Learning Continuous Control Policies by Stochastic Value Gradients, 2015, NIPS.
[19] Yuval Tassa et al. Learning human behaviors from motion capture by adversarial imitation, 2017, ArXiv.
[20] Murilo F. Martins et al. Simultaneously Learning Vision and Feature-based Control Policies for Real-world Ball-in-a-Cup, 2019, Robotics: Science and Systems.
[21] Yuval Tassa et al. Maximum a Posteriori Policy Optimisation, 2018, ICLR.
[22] Nicolas Heess et al. Hierarchical visuomotor control of humanoids, 2018, ICLR.
[23] Rémi Coulom. Reinforcement Learning Using Neural Networks, with Applications to Motor Control, 2002.
[24] Martin A. Riedmiller et al. Learning by Playing - Solving Sparse Reward Tasks from Scratch, 2018, ICML.
[25] Marc G. Bellemare et al. The Arcade Learning Environment: An Evaluation Platform for General Agents, 2012, J. Artif. Intell. Res.
[26] Yuval Tassa et al. MuJoCo: A physics engine for model-based control, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[27] Sergio Gomez Colmenarejo et al. One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL, 2018, ArXiv.
[28] Guy Lever et al. Emergent Coordination Through Competition, 2019, ICLR.
[29] Yee Whye Teh et al. Neural probabilistic motor primitives for humanoid control, 2018, ICLR.
[30] David K. Smith et al. Dynamic Programming and Optimal Control, Volume 1, 1996.
[31] Sergey Levine et al. Trust Region Policy Optimization, 2015, ICML.
[32] Guy Lever et al. The Body is Not a Given: Joint Agent Policy Learning and Morphology Evolution, 2019, AAMAS.
[33] Pawel Wawrzynski et al. Real-time reinforcement learning by sequential Actor-Critics and experience replay, 2009, Neural Networks.
[34] Nando de Freitas et al. Reinforcement and Imitation Learning for Diverse Visuomotor Skills, 2018, Robotics: Science and Systems.
[35] Raia Hadsell et al. Success at any cost: value constrained model-free continuous control, 2018.
[36] Alex Graves et al. Asynchronous Methods for Deep Reinforcement Learning, 2016, ICML.
[37] Yuval Tassa et al. Simulation tools for model-based robotics: Comparison of Bullet, Havok, MuJoCo, ODE and PhysX, 2015, IEEE International Conference on Robotics and Automation (ICRA).
[38] Demis Hassabis et al. Mastering the game of Go with deep neural networks and tree search, 2016, Nature.
[39] Yuval Tassa et al. Synthesis and stabilization of complex behaviors through online trajectory optimization, 2012, IEEE/RSJ International Conference on Intelligent Robots and Systems.
[40] Joseph J. Lim et al. IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks, 2019, IEEE International Conference on Robotics and Automation (ICRA) 2021.
[41] Yuval Tassa et al. Continuous control with deep reinforcement learning, 2015, ICLR.
[42] Misha Denil et al. Learning Awareness Models, 2018, ICLR.
[43] Sergey Levine et al. Meta-World: A Benchmark and Evaluation for Multi-Task and Meta Reinforcement Learning, 2019, CoRL.
[44] Yuval Tassa et al. Catch & Carry, 2020, ACM Trans. Graph.
[45] Karl Sims et al. Evolving virtual creatures, 1994, SIGGRAPH.
[46] Yuval Tassa et al. DeepMind Control Suite, 2018, ArXiv.