Domain Randomization and Generative Models for Robotic Grasping

Deep learning-based robotic grasping has made significant progress thanks to algorithmic improvements and increased data availability. However, state-of-the-art models are often trained on only hundreds or thousands of unique object instances, which makes generalization to new objects a challenge. In this work, we explore a novel data generation pipeline for grasp planning that applies the idea of domain randomization to object synthesis. We procedurally generate millions of unique, unrealistic objects and train a deep neural network to perform grasp planning on them. Since the distribution of successful grasps for a given object can be highly multimodal, we propose an autoregressive grasp planning model that maps sensor inputs of a scene to a probability distribution over possible grasps. This model allows us to sample grasps efficiently at test time (or avoid sampling entirely). We evaluate our model architecture and data generation pipeline in simulation and the real world. We achieve a >90% success rate on previously unseen realistic objects at test time in simulation despite training only on random objects. We also demonstrate an 80% success rate on real-world grasp attempts despite training only on random simulated objects.
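
To make the autoregressive idea concrete, the sketch below factorizes a distribution over discretized grasps with the chain rule, p(g | obs) = p(g_1 | obs) p(g_2 | g_1, obs) p(g_3 | g_1, g_2, obs), and samples one dimension at a time. It is a toy illustration, not the model from this work: the three-dimensional grasp encoding, the bin count, and the `toy_conditional` function (a stand-in for a learned network) are all assumptions made for the example.

```python
import itertools
import math
import random

NUM_BINS = 4   # hypothetical number of discretization bins per grasp dimension
NUM_DIMS = 3   # hypothetical grasp dimensions, e.g. (x, y, z)


def softmax(logits):
    """Convert a list of logits into a normalized probability list."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_conditional(obs, prefix):
    """Stand-in for a learned network: p(g_i | g_<i, obs) over NUM_BINS bins.

    The logits depend on the observation and on the bins already chosen,
    so later dimensions can change with earlier choices.
    """
    shift = sum(prefix) + len(prefix)
    logits = [math.cos(obs * (b + 1) + shift) for b in range(NUM_BINS)]
    return softmax(logits)


def sample_grasp(obs, rng):
    """Draw one grasp by sampling each dimension conditioned on earlier ones."""
    grasp, log_prob = [], 0.0
    for _ in range(NUM_DIMS):
        probs = toy_conditional(obs, grasp)
        choice = rng.choices(range(NUM_BINS), weights=probs, k=1)[0]
        log_prob += math.log(probs[choice])
        grasp.append(choice)
    return tuple(grasp), log_prob


def grasp_log_prob(obs, grasp):
    """Chain-rule log-probability of a complete grasp under the toy model."""
    return sum(
        math.log(toy_conditional(obs, list(grasp[:i]))[g])
        for i, g in enumerate(grasp)
    )


def total_probability(obs):
    """Sum the probability of every possible grasp; should be close to 1."""
    return sum(
        math.exp(grasp_log_prob(obs, g))
        for g in itertools.product(range(NUM_BINS), repeat=NUM_DIMS)
    )


rng = random.Random(0)
grasp, log_prob = sample_grasp(0.7, rng)
```

Because each conditional is a normalized categorical distribution, the product over dimensions is itself a valid joint distribution, which lets a model of this shape place probability mass on several distinct grasps for one observation.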
