Combining self-supervised learning and imitation for vision-based rope manipulation

Manipulation of deformable objects, such as ropes and cloth, is an important but challenging problem in robotics. We present a learning-based system in which a robot takes as input a sequence of images of a human manipulating a rope from an initial configuration to a goal configuration, and outputs a sequence of actions that reproduce the human demonstration using only monocular images. To perform this task, the robot learns a pixel-level inverse dynamics model of rope manipulation directly from images in a self-supervised manner, using about 60K interactions with the rope collected autonomously by the robot. The human demonstration provides a high-level plan of what to do, and the low-level inverse model is used to execute the plan. We show that by combining the high- and low-level plans, the robot can successfully manipulate a rope into a variety of target shapes using only a sequence of human-provided images for direction.
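To make the two-part pipeline concrete, here is a minimal sketch (not the authors' released code) of the components the abstract describes: a convolutional inverse dynamics model that maps a pair of consecutive images to the action connecting them, and an execution loop that walks through the human demonstration frames, treating each frame as the next subgoal. The class and interface names (`InverseModel`, `robot.observe`, `robot.execute`) and the 4-dimensional action parameterization are illustrative assumptions, not the paper's exact architecture.

```python
# Sketch of (1) an inverse dynamics model f(I_t, I_{t+1}) -> a_t trained on
# self-supervised robot-rope interactions, and (2) greedy execution of a
# human demonstration by using each consecutive frame pair as a subgoal.
import torch
import torch.nn as nn


class InverseModel(nn.Module):
    """Predicts the action that transforms the current image into the next image."""

    def __init__(self, action_dim: int = 4):
        super().__init__()
        # A shared CNN encoder is applied to both images (same weights).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # The head maps the concatenated embeddings to an action, e.g. a
        # hypothetical (pick_x, pick_y, push_angle, push_length) vector.
        self.head = nn.Sequential(
            nn.Linear(2 * 64, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, img_t: torch.Tensor, img_next: torch.Tensor) -> torch.Tensor:
        z_t = self.encoder(img_t)
        z_next = self.encoder(img_next)
        return self.head(torch.cat([z_t, z_next], dim=1))


def follow_demonstration(model: InverseModel, robot, demo_frames) -> None:
    """Execute a human demo: each demo frame becomes the next visual subgoal.

    `robot` is a hypothetical interface with observe() -> (3, H, W) image
    tensor and execute(action) to run one manipulation primitive.
    """
    with torch.no_grad():
        for goal in demo_frames[1:]:
            current = robot.observe()
            action = model(current.unsqueeze(0), goal.unsqueeze(0))[0]
            robot.execute(action)
```

Treating each consecutive demonstration frame as a subgoal keeps the learned model short-horizon: it only ever predicts a single action between two nearby rope configurations, which is exactly the kind of transition the ~60K self-supervised interactions supervise.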
