Deep Adversarial Imitation Learning of Locomotion Skills from One-shot Video Demonstration

Traditional imitation learning approaches usually collect demonstrations through teleoperation, kinesthetic teaching, or precisely calibrated motion capture devices. These teaching interfaces are cumbersome and constrained by the environment and the robot's structure. Learning from observation adopts the idea that a robot can acquire skills by watching human behavior, which is more convenient and preferable. However, learning from observation poses great challenges, since it requires understanding the environment and human actions, as well as solving the retargeting problem. This paper presents a method for learning locomotion skills from a single video demonstration. We first leverage a weakly supervised method to extract pose features from the expert, and then learn a joint position controller that matches these features using a generative adversarial network (GAN). This approach avoids cumbersome demonstrations and, more importantly, the GAN can generalize the learned skills to different subjects. We evaluated our method on a walking task executed by a 56-degree-of-freedom (DOF) humanoid robot. The experiments demonstrate that the vision-based imitation learning algorithm can be applied to high-dimensional robot tasks and achieves performance comparable to methods using finely calibrated motion capture data, which is of great significance for research on human-robot interaction and robot skill acquisition.
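The adversarial matching step described above can be illustrated with a minimal sketch. This is not the paper's implementation: the feature dimension, network (a single logistic unit rather than a deep discriminator), and toy data are all assumptions made for illustration. The discriminator is trained to separate expert pose features from policy-generated ones, and its confusion is turned into an imitation reward that a reinforcement learner (e.g. PPO) would maximize, in the style of generative adversarial imitation learning:

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM = 8  # hypothetical size of the pose-feature vector extracted from video


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class Discriminator:
    """Logistic discriminator D(phi) distinguishing expert pose features
    from policy-generated ones (a stand-in for a deep discriminator)."""

    def __init__(self, dim, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def prob_expert(self, feats):
        return sigmoid(feats @ self.w + self.b)

    def update(self, expert_feats, policy_feats):
        # One step of gradient ascent on  E_expert[log D] + E_policy[log(1 - D)].
        d_e = self.prob_expert(expert_feats)
        d_p = self.prob_expert(policy_feats)
        grad_w = (expert_feats.T @ (1.0 - d_e)) / len(expert_feats) \
               - (policy_feats.T @ d_p) / len(policy_feats)
        grad_b = np.mean(1.0 - d_e) - np.mean(d_p)
        self.w += self.lr * grad_w
        self.b += self.lr * grad_b

    def reward(self, feats):
        # Imitation reward handed to the RL learner: high when the
        # discriminator mistakes policy features for expert ones.
        return -np.log(1.0 - self.prob_expert(feats) + 1e-8)


# Toy stand-in data: expert features cluster near +1, an untrained policy near -1.
expert_feats = rng.normal(+1.0, 0.3, size=(256, FEAT_DIM))
policy_feats = rng.normal(-1.0, 0.3, size=(256, FEAT_DIM))

disc = Discriminator(FEAT_DIM)
for _ in range(200):
    disc.update(expert_feats, policy_feats)

# After training, expert-like features earn more imitation reward,
# so the policy is driven toward the demonstrated pose distribution.
print(disc.reward(expert_feats).mean() > disc.reward(policy_feats).mean())
```

In the full method, the policy update (the paper's joint position controller, trained with an RL algorithm such as PPO) and the discriminator update would alternate, with the discriminator reward replacing a hand-designed tracking cost.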
