Paper information - Demonstration-Guided Deep Reinforcement Learning of Control Policies for Dexterous Human-Robot Interaction

In this paper, we propose a method for training control policies for human-robot interactions, such as handshakes or hand claps, via Deep Reinforcement Learning. The policy controls a humanoid Shadow Dexterous Hand attached to a robot arm. We propose a parameterizable multi-objective reward function that allows a variety of interactions to be learned without changing the reward structure. The parameters of the reward function are estimated directly from motion capture data of human-human interactions, so that the resulting policies are perceived by observers as natural and human-like. We evaluate our method on three significantly different hand interactions: handshake, hand clap, and finger touch. We provide a detailed analysis of the proposed reward function and the resulting policies, and we conduct a large-scale user study indicating that our policy produces natural-looking motions.

Otmar Hilliges | Sammy Christen | Stefan Stevsic
