Learning Rope Manipulation Policies Using Dense Object Descriptors Trained on Synthetic Depth Data

Robotic manipulation of deformable 1D objects such as ropes, cables, and hoses is challenging due to the lack of high-fidelity analytic models and large configuration spaces. Furthermore, learning end-to-end manipulation policies directly from images and physical interaction requires significant time on a robot and can fail to generalize across tasks. We address these challenges using interpretable deep visual representations for rope, extending recent work on dense object descriptors for robot manipulation. This facilitates the design of interpretable, transferable geometric policies built on top of the learned representations, decoupling visual reasoning from control. We present an approach that learns point-pair correspondences between initial and goal rope configurations, which implicitly encode geometric structure, entirely in simulation from synthetic depth images. We demonstrate that the learned representation, dense depth object descriptors (DDODs), can be used to manipulate a real rope into a variety of arrangements, either by learning from demonstrations or by using interpretable geometric policies. In 50 trials of a knot-tying task with the ABB YuMi robot, the system achieves a 66% success rate from previously unseen configurations. See https://tinyurl.com/rope-learning for supplementary material and videos.
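
For intuition, the sketch below shows how a dense descriptor map reduces correspondence to nearest-neighbor search: each pixel of the initial and goal depth images is mapped to a descriptor vector, and a point on the rope in the initial image is matched to the goal-image pixel with the closest descriptor. The array shapes and function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def match_descriptor(d_init, d_goal, pixel):
    """Nearest-neighbor correspondence in descriptor space (illustrative).

    d_init, d_goal: (H, W, D) descriptor maps for the initial and goal
                    depth images, as produced by a dense descriptor network.
    pixel:          (row, col) query location on the rope in the initial image.
    Returns the (row, col) in the goal image whose descriptor is closest.
    """
    query = d_init[pixel[0], pixel[1]]               # (D,) descriptor at the query pixel
    dists = np.linalg.norm(d_goal - query, axis=-1)  # (H, W) L2 distance field
    flat = np.argmin(dists)                          # flat index of the best match
    return np.unravel_index(flat, dists.shape)       # back to (row, col) coordinates
```

In this formulation, a geometric policy needs only a handful of such matched point pairs (e.g., grasp and placement points) to transfer a manipulation plan from the goal image to the current scene.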
