Learning Multi-Arm Manipulation Through Collaborative Teleoperation

Imitation Learning (IL) is a powerful paradigm for teaching robots to perform manipulation tasks by letting them learn from human demonstrations collected via teleoperation, but it has mostly been limited to single-arm manipulation. However, many real-world tasks require multiple arms, such as lifting a heavy object or assembling a desk. Unfortunately, applying IL to multi-arm manipulation tasks has been challenging: asking a human to control more than one robotic arm imposes a significant cognitive burden and is often only feasible for a maximum of two robot arms. To address these challenges, we present Multi-Arm RoboTurk (MART), a multi-user data collection platform that allows multiple remote users to simultaneously teleoperate a set of robotic arms and collect demonstrations for multi-arm tasks. Using MART, we collected demonstrations for five novel two- and three-arm tasks from several geographically separated users. From our data we arrived at a critical insight: most multi-arm tasks do not require global coordination throughout their full duration, but only during specific moments. We show that learning from such data consequently presents challenges for centralized agents that directly attempt to model all robot actions simultaneously, and we perform a comprehensive study of different policy architectures with varying levels of centralization on our tasks. Finally, we propose and evaluate a base-residual policy framework that allows trained policies to better adapt to the mixed-coordination setting common in multi-arm manipulation, and show that a centralized policy augmented with a decentralized residual model outperforms all other models on our set of benchmark tasks. Additional results and videos are available at this https URL.
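
Below is a minimal sketch of the base-residual idea described in the abstract: a centralized "base" policy predicts all arms' actions from the global observation, while a decentralized "residual" network adds a per-arm correction computed only from that arm's local observation. All class names, argument names, and dimensions here are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a centralized base policy plus decentralized residual correction.
# Names and dimensions are assumptions; the paper's exact architecture may differ.
import torch
import torch.nn as nn


def mlp(in_dim, out_dim, hidden=256):
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


class BaseResidualPolicy(nn.Module):
    def __init__(self, num_arms, local_obs_dim, action_dim_per_arm):
        super().__init__()
        self.num_arms = num_arms
        self.action_dim = action_dim_per_arm
        # Centralized base: sees the concatenated observations of every arm
        # and outputs actions for all arms jointly.
        self.base = mlp(num_arms * local_obs_dim, num_arms * action_dim_per_arm)
        # Decentralized residual: applied to each arm's local observation
        # independently (weights shared across arms in this sketch).
        self.residual = mlp(local_obs_dim, action_dim_per_arm)

    def forward(self, local_obs):
        # local_obs: (batch, num_arms, local_obs_dim)
        batch = local_obs.shape[0]
        global_obs = local_obs.reshape(batch, -1)
        base_actions = self.base(global_obs).reshape(
            batch, self.num_arms, self.action_dim
        )
        # nn.Linear acts on the last dimension, so the residual is computed
        # per arm without seeing the other arms' observations.
        residuals = self.residual(local_obs)
        return base_actions + residuals


if __name__ == "__main__":
    policy = BaseResidualPolicy(num_arms=3, local_obs_dim=30, action_dim_per_arm=7)
    actions = policy(torch.randn(8, 3, 30))  # -> shape (8, 3, 7)
```

In this sketch the residual shares weights across arms and is summed directly onto the base action; the abstract does not specify whether the residual is shared or per-arm, nor how the two components are trained, so those details should be read as assumptions rather than the paper's method.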
