Learning to Coordinate Manipulation Skills via Skill Behavior Diversification

When mastering a complex manipulation task, humans often decompose the task into sub-skills for different body parts, practice the sub-skills independently, and then execute them together. Similarly, a robot with multiple end-effectors can perform a complex task by coordinating sub-skills of each end-effector. To realize temporal and behavioral coordination of skills, we propose a hierarchical framework that first trains the sub-skills of each end-effector individually with skill behavior diversification, and then learns to coordinate the end-effectors by exploiting the diverse behaviors of those skills. We demonstrate that our framework efficiently learns sub-skills with diverse behaviors and coordinates them to solve challenging collaborative control tasks, such as picking up a long bar, placing a block inside a container while pushing the container with two robot arms, and pushing a box with two ant agents.
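To make the two-level structure concrete, here is a minimal PyTorch sketch of how such a framework could be wired together. Everything below is an illustrative assumption rather than the authors' implementation: the class names (`SkillPolicy`, `MetaPolicy`), the network sizes, and the tanh action squashing are all hypothetical. The idea it captures is that a high-level meta-policy chooses, for every end-effector, which pre-trained sub-skill to run and a behavior latent `z` that modulates how that skill behaves.

```python
import torch
import torch.nn as nn


class SkillPolicy(nn.Module):
    """Low-level sub-skill for one end-effector, conditioned on a
    behavior latent z so a single skill can express diverse behaviors."""

    def __init__(self, obs_dim, z_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),  # actions in [-1, 1]
        )

    def forward(self, obs, z):
        return self.net(torch.cat([obs, z], dim=-1))


class MetaPolicy(nn.Module):
    """High-level coordinator: for each end-effector, picks which
    sub-skill to execute and which behavior latent to condition it on."""

    def __init__(self, obs_dim, n_effectors, n_skills, z_dim, hidden=64):
        super().__init__()
        self.n_effectors, self.n_skills, self.z_dim = n_effectors, n_skills, z_dim
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.skill_head = nn.Linear(hidden, n_effectors * n_skills)  # skill logits
        self.z_head = nn.Linear(hidden, n_effectors * z_dim)         # behavior latents

    def forward(self, obs):
        h = self.trunk(obs)
        logits = self.skill_head(h).view(-1, self.n_effectors, self.n_skills)
        skills = torch.distributions.Categorical(logits=logits).sample()
        z = torch.tanh(self.z_head(h)).view(-1, self.n_effectors, self.z_dim)
        return skills, z  # per-effector skill indices and behavior latents
```

In use, the meta-policy would act on a coarser timescale than the sub-skills: it emits a `(skill, z)` pair per end-effector, each selected skill then runs for a fixed number of low-level steps on that effector's observation via `skill_policies[e][skill](obs_e, z_e)`, and only then does the meta-policy choose again. During pre-training, the latent `z` could be tied to a diversity objective, for instance a DIAYN-style discriminator reward `log q(z | s)`, so that distinct latents induce visibly different behaviors of the same skill; the paper's exact pre-training objective may differ from this sketch.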
