Reusable neural skill embeddings for vision-guided whole body movement and object manipulation

Both in simulation settings and robotics, there is an ambition to produce flexible control systems that can enable complex bodies to perform dynamic locomotion and natural object manipulation. In previous work, we developed a framework to train locomotor skills and reuse these skills for whole-body visuomotor tasks. Here, we extend this line of work to tasks involving whole body movement as well as visually guided manipulation of objects. This setting poses novel challenges in terms of task specification, exploration, and generalization. We develop an integrated approach consisting of a flexible motor primitive module, demonstrations, an instructed training regime as well as curricula in the form of task variations. We demonstrate the utility of our approach for solving challenging whole body tasks that require joint locomotion and manipulation, and characterize its behavioral robustness. We also provide a high-level overview video, see this https URL .

[1]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[2]  Stefan Schaal,et al.  Computational approaches to motor learning by imitation. , 2003, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.

[3]  Joonho Lee,et al.  Learning agile and dynamic motor skills for legged robots , 2019, Science Robotics.

[4]  Jonathan W. Hurst,et al.  Iterative Reinforcement Learning Based Design of Dynamic Locomotion Skills for Cassie , 2019, ArXiv.

[5]  Sergey Levine,et al.  Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[6]  Zoran Popovic,et al.  Discovery of complex behaviors through contact-invariant optimization , 2012, ACM Trans. Graph..

[7]  Javier R. Movellan,et al.  STAC: Simultaneous tracking and calibration , 2013, 2013 13th IEEE-RAS International Conference on Humanoid Robots (Humanoids).

[8]  Peter Englert,et al.  Learning manipulation skills from a single demonstration , 2018, Int. J. Robotics Res..

[9]  Sergey Levine,et al.  Learning Latent Plans from Play , 2019, CoRL.

[10]  Wojciech Zaremba,et al.  Domain randomization for transferring deep neural networks from simulation to the real world , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[11]  Aude Billard,et al.  Catching Objects in Flight , 2014, IEEE Transactions on Robotics.

[12]  Atil Iscen,et al.  Sim-to-Real: Learning Agile Locomotion For Quadruped Robots , 2018, Robotics: Science and Systems.

[13]  Xinyu Liu,et al.  Dex-Net 2.0: Deep Learning to Plan Robust Grasps with Synthetic Point Clouds and Analytic Grasp Metrics , 2017, Robotics: Science and Systems.

[14]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[15]  Jan Peters,et al.  Noname manuscript No. (will be inserted by the editor) Policy Search for Motor Primitives in Robotics , 2022 .

[16]  Scott Niekum,et al.  Learning and generalization of complex tasks from unstructured demonstrations , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[17]  Yuval Tassa,et al.  Learning human behaviors from motion capture by adversarial imitation , 2017, ArXiv.

[18]  Sergey Levine,et al.  Learning Dexterous Manipulation Policies from Experience and Imitation , 2016, ArXiv.

[19]  H. Francis Song,et al.  V-MPO: On-Policy Maximum a Posteriori Policy Optimization for Discrete and Continuous Control , 2019, ICLR.

[20]  Razvan Pascanu,et al.  Sim-to-Real Robot Learning from Pixels with Progressive Nets , 2016, CoRL.

[21]  Stefan Schaal,et al.  Robot Learning From Demonstration , 1997, ICML.

[22]  Demis Hassabis,et al.  A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play , 2018, Science.

[23]  Joonho Lee,et al.  Robust Recovery Controller for a Quadrupedal Robot using Deep Reinforcement Learning , 2019, ArXiv.

[24]  Oussama Khatib,et al.  Synthesis of Whole-Body Behaviors through Hierarchical Control of Behavioral Primitives , 2005, Int. J. Humanoid Robotics.

[25]  Jakub W. Pachocki,et al.  Learning dexterous in-hand manipulation , 2018, Int. J. Robotics Res..

[26]  Yuval Tassa,et al.  Maximum a Posteriori Policy Optimisation , 2018, ICLR.

[27]  Sergey Levine,et al.  (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.

[28]  Sergey Levine,et al.  MCP: Learning Composable Hierarchical Control with Multiplicative Compositional Policies , 2019, NeurIPS.

[29]  Maja J. Mataric,et al.  Automated derivation of behavior vocabularies for autonomous humanoid motion , 2003, AAMAS '03.

[30]  Nando de Freitas,et al.  Reinforcement and Imitation Learning for Diverse Visuomotor Skills , 2018, Robotics: Science and Systems.

[31]  Aude Billard,et al.  Reinforcement learning for imitating constrained reaching movements , 2007, Adv. Robotics.

[32]  Sergey Levine,et al.  Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations , 2017, Robotics: Science and Systems.

[33]  Stefan Schaal,et al.  2008 Special Issue: Reinforcement learning of motor skills with policy gradients , 2008 .

[34]  Sergey Levine,et al.  DeepMimic , 2018, ACM Trans. Graph..

[35]  Alberto Rodriguez,et al.  TossingBot: Learning to Throw Arbitrary Objects With Residual Physics , 2019, IEEE Transactions on Robotics.

[36]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[37]  Sergey Levine,et al.  Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection , 2016, Int. J. Robotics Res..

[38]  Leslie Pack Kaelbling,et al.  Effective reinforcement learning for mobile robots , 2002, Proceedings 2002 IEEE International Conference on Robotics and Automation (Cat. No.02CH37292).

[39]  Stefano Ermon,et al.  Generative Adversarial Imitation Learning , 2016, NIPS.

[40]  Andrew Y. Ng,et al.  Pharmacokinetics of a novel formulation of ivermectin after administration to goats , 2000, ICML.

[41]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[42]  Stefan Schaal,et al.  Skill learning and task outcome prediction for manipulation , 2011, 2011 IEEE International Conference on Robotics and Automation.

[43]  Yuval Tassa,et al.  Learning and Transfer of Modulated Locomotor Controllers , 2016, ArXiv.

[44]  Yuval Tassa,et al.  Emergence of Locomotion Behaviours in Rich Environments , 2017, ArXiv.

[45]  Stefan Schaal,et al.  Learning from Demonstration , 1996, NIPS.

[46]  Nicolas Heess,et al.  Hierarchical visuomotor control of humanoids , 2018, ICLR.

[47]  Pieter Abbeel,et al.  Learning for control from multiple demonstrations , 2008, ICML '08.

[48]  Martin A. Riedmiller,et al.  Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards , 2017, ArXiv.

[49]  Glen Berseth,et al.  Terrain-adaptive locomotion skills using deep reinforcement learning , 2016, ACM Trans. Graph..

[50]  Shane Legg,et al.  IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures , 2018, ICML.

[51]  Yee Whye Teh,et al.  Neural probabilistic motor primitives for humanoid control , 2018, ICLR.

[52]  Kazuya Otani,et al.  Adaptive whole-body manipulation in human-to-humanoid multi-contact motion retargeting , 2017, 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids).