Paper Information - One-Shot Imitation from Observing Humans via Domain-Adaptive Meta-Learning

Humans and animals are capable of learning a new behavior by observing others perform the skill just once. We consider the problem of allowing a robot to do the same -- learning from raw video pixels of a human, even when there is substantial domain shift in the perspective, environment, and embodiment between the robot and the observed human. Prior approaches to this problem have hand-specified how human and robot actions correspond and often relied on explicit human pose detection systems. In this work, we present an approach for one-shot learning from a video of a human by using human and robot demonstration data from a variety of previous tasks to build up prior knowledge through meta-learning. Then, combining this prior knowledge with only a single video demonstration from a human, the robot can perform the task that the human demonstrated. We show experiments on both a PR2 arm and a Sawyer arm, demonstrating that after meta-learning, the robot can learn to place, push, and pick-and-place new objects using just one video of a human performing the manipulation.
