Learning to Manipulate Deformable Objects without Demonstrations

In this paper we tackle the problem of deformable object manipulation through model-free visual reinforcement learning (RL). To circumvent the sample inefficiency of RL, we propose two key ideas that accelerate learning. First, we propose an iterative pick-place action space that encodes the conditional relationship between picking and placing on deformable objects. This explicit structural encoding enables faster learning under complex object dynamics. Second, instead of jointly learning both the pick and the place locations, we explicitly learn only the placing policy, conditioned on random pick points. Then, by selecting the pick point that has Maximal Value under Placing (MVP), we obtain our picking policy. This gives us an informed picking policy at test time while using only random pick points during training. Experimentally, this learning framework learns an order of magnitude faster than independent action spaces on our suite of deformable object manipulation tasks with visual RGB observations. Finally, using domain randomization, we transfer our policies to a real PR2 robot for challenging cloth and rope coverage tasks, and demonstrate significant improvements in average coverage over standard RL techniques.
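The MVP selection rule described in the abstract can be illustrated with a minimal sketch. It assumes a learned placing policy and a learned value function are already available as callables. The names `mvp_pick`, `place_policy`, `q_value`, and the toy stand-ins below are hypothetical and are not taken from the paper's code. Each candidate pick point is scored by the value of the place action the placing policy proposes for it, and the highest-scoring pick is returned.

```python
import numpy as np


def mvp_pick(obs, candidate_picks, place_policy, q_value):
    """Select the pick point with Maximal Value under Placing.

    place_policy(obs, pick) -> place action (assumed interface)
    q_value(obs, pick, place) -> scalar value estimate (assumed interface)
    """
    # Propose one place action per candidate pick point.
    places = [place_policy(obs, pick) for pick in candidate_picks]
    # Score each (pick, place) pair with the value function.
    values = np.array(
        [q_value(obs, pick, place) for pick, place in zip(candidate_picks, places)]
    )
    best = int(np.argmax(values))
    return candidate_picks[best], places[best], float(values[best])


# Toy stand-ins for illustration only (not learned models).
def toy_place_policy(obs, pick):
    # Always place at a fixed target location stored in the observation.
    return obs["target"]


def toy_q_value(obs, pick, place):
    # Pretend that moving a point a longer distance yields more value.
    return float(np.linalg.norm(np.asarray(place) - np.asarray(pick)))


obs = {"target": np.array([1.0, 1.0])}
picks = [np.array([0.9, 0.9]), np.array([0.0, 0.0]), np.array([0.5, 0.5])]
pick, place, value = mvp_pick(obs, picks, toy_place_policy, toy_q_value)
print(pick, place, value)
```

During training, under this scheme, the pick point would simply be sampled at random and only the placing policy and value function would be updated; the argmax over candidate picks is applied at test time.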
