Domain randomization for transferring deep neural networks from simulation to the real world

Bridging the ‘reality gap’ that separates simulated robotics from experiments on hardware could accelerate robotic research through improved data availability. This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator. With enough variability in the simulator, the real world may appear to the model as just another variation. We focus on the task of object localization, which is a stepping stone to general robotic manipulation skills. We find that it is possible to train a real-world object detector that is accurate to 1.5 cm and robust to distractors and partial occlusions using only data from a simulator with non-realistic random textures. To demonstrate the capabilities of our detectors, we show they can be used to perform grasping in a cluttered environment. To our knowledge, this is the first successful transfer of a deep neural network trained only on simulated RGB images (without pre-training on real images) to the real world for the purpose of robotic control.

[1]  Ramakant Nevatia,et al.  Description and Recognition of Curved Objects , 1977, Artif. Intell..

[2]  David G. Lowe,et al.  Three-Dimensional Object Recognition from Single Two-Dimensional Images , 1987, Artif. Intell..

[3]  Gerd Hirzinger,et al.  Real-time visual tracking of 3D objects with dynamic handling of occlusion , 1997, Proceedings of International Conference on Robotics and Automation.

[4]  Nick Jakobi,et al.  Evolutionary Robotics and the Radical Envelope-of-Noise Hypothesis , 1997, Adapt. Behav..

[5]  Danica Kragic,et al.  Object recognition and pose estimation using color cooccurrence histograms and geometric modeling , 2005, Image Vis. Comput..

[6]  Pieter Abbeel,et al.  Using inaccurate models in reinforcement learning , 2006, ICML.

[7]  David G. Lowe,et al.  What and Where: 3D Object Recognition with Accurate Pose , 2006, Toward Category-Level Object Recognition.

[8]  Manuela M. Veloso,et al.  Detection and Localization of Multiple Objects , 2006, 2006 6th IEEE-RAS International Conference on Humanoid Robots.

[9]  Rong Yan,et al.  Cross-domain video concept detection using adaptive svms , 2007, ACM Multimedia.

[10]  Andrew Y. Ng,et al.  Learning omnidirectional path following using dimensionality reduction , 2007, Robotics: Science and Systems.

[11]  C. Stachniss,et al.  Learning Omnidirectional Path Following Using Dimensionality Reduction , 2008 .

[12]  Siddhartha S. Srinivasa,et al.  Object recognition and full pose registration from a single image for robotic manipulation , 2009, 2009 IEEE International Conference on Robotics and Automation.

[13]  Yishay Mansour,et al.  Domain Adaptation: Learning Bounds and Algorithms , 2009, COLT.

[14]  Pieter Abbeel,et al.  Superhuman performance of surgical tasks by robots using iterative learning from human-guided demonstrations , 2010, 2010 IEEE International Conference on Robotics and Automation.

[15]  Siddhartha S. Srinivasa,et al.  Efficient multi-view object recognition and full pose estimation , 2010, 2010 IEEE International Conference on Robotics and Automation.

[16]  Jan Peters,et al.  Model learning for robot control: a survey , 2011, Cognitive Processing.

[17]  Siddhartha S. Srinivasa,et al.  The MOPED framework: Object recognition and pose estimation for manipulation , 2011, Int. J. Robotics Res..

[18]  Trevor Darrell,et al.  What you saw is not what you get: Domain adaptation using asymmetric kernel transforms , 2011, CVPR 2011.

[19]  Pieter Abbeel,et al.  A textured object recognition pipeline for color and depth image data , 2012, 2012 IEEE International Conference on Robotics and Automation.

[20]  Yuval Tassa,et al.  MuJoCo: A physics engine for model-based control , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[21]  Ivor W. Tsang,et al.  Learning with Augmented Features for Heterogeneous Domain Adaptation , 2012, ICML.

[22]  Jürgen Leitner,et al.  Artificial neural networks for spatial perception: Towards visual object localisation in humanoid robots , 2013, The 2013 International Joint Conference on Neural Networks (IJCNN).

[23]  Trevor Darrell,et al.  Efficient Learning of Domain-invariant Image Representations , 2013, ICLR.

[24]  Stéphane Doncieux,et al.  The Transferability Approach: Crossing the Reality Gap in Evolutionary Robotics , 2013, IEEE Transactions on Evolutionary Computation.

[25]  Yoshua Bengio,et al.  How transferable are features in deep neural networks? , 2014, NIPS.

[26]  Iasonas Kokkinos,et al.  Describing Textures in the Wild , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Trevor Darrell,et al.  Deep Domain Confusion: Maximizing for Domain Invariance , 2014, CVPR 2014.

[28]  Kate Saenko,et al.  From Virtual to Reality: Fast Adaptation of Virtual Object Detectors to Real Domains , 2014, BMVC.

[29]  Trevor Darrell,et al.  LSDA: Large Scale Detection through Adaptation , 2014, NIPS.

[30]  Jonathan P. How,et al.  Reinforcement learning with multi-fidelity simulators , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[31]  Emanuel Todorov,et al.  Ensemble-CIO: Full-body dynamic motion planning that transfers to physical humanoids , 2015, 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[32]  Antoine Cully,et al.  Robots that can adapt like animals , 2014, Nature.

[33]  Sergey Levine,et al.  Trust Region Policy Optimization , 2015, ICML.

[34]  Leonidas J. Guibas,et al.  Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[35]  Michael I. Jordan,et al.  Learning Transferable Features with Deep Adaptation Networks , 2015, ICML.

[36]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[37]  Jonathan P. How,et al.  Efficient reinforcement learning for robots using informative simulated priors , 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[38]  Kate Saenko,et al.  Learning Deep Object Detectors from 3D Models , 2014, 2015 IEEE International Conference on Computer Vision (ICCV).

[39]  Shane Legg,et al.  Human-level control through deep reinforcement learning , 2015, Nature.

[40]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[41]  Siddhartha S. Srinivasa,et al.  The YCB object and Model set: Towards common benchmarks for manipulation research , 2015, 2015 International Conference on Advanced Robotics (ICAR).

[42]  Peter I. Corke,et al.  Vision-Based Reaching Using Modular Deep Networks: from Simulation to the Real World , 2016, ArXiv.

[43]  Vladlen Koltun,et al.  Playing for Data: Ground Truth from Computer Games , 2016, ECCV.

[44]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[45]  Sergey Levine,et al.  Adapting Deep Visuomotor Representations with Weak Pairwise Constraints , 2015, WAFR.

[46]  Daniel King,et al.  Fetch & Freight : Standard Platforms for Service Robot Applications , 2016 .

[47]  Stephen James,et al.  3D Simulation for Robot Arm Control with Deep Q-Learning , 2016, ArXiv.

[48]  Takeo Kanade,et al.  How Useful Is Photo-Realistic Rendering for Visual Learning? , 2016, ECCV Workshops.

[49]  Sergey Levine,et al.  End-to-End Training of Deep Visuomotor Policies , 2015, J. Mach. Learn. Res..

[50]  Lior Wolf,et al.  Unsupervised Cross-Domain Image Generation , 2016, ICLR.

[51]  Jiaying Liu,et al.  Revisiting Batch Normalization For Practical Domain Adaptation , 2016, ICLR.

[52]  Razvan Pascanu,et al.  Sim-to-Real Robot Learning from Pixels with Progressive Nets , 2016, CoRL.

[53]  Greg Turk,et al.  Preparing for the Unknown: Learning a Universal Policy with Online System Identification , 2017, Robotics: Science and Systems.

[54]  Balaraman Ravindran,et al.  EPOpt: Learning Robust Neural Network Policies Using Model Ensembles , 2016, ICLR.

[55]  Sergey Levine,et al.  Learning Invariant Feature Spaces to Transfer Skills with Reinforcement Learning , 2017, ICLR.

[56]  Danica Kragic,et al.  Reinforcement Learning for Pivoting Task , 2017, ArXiv.

[57]  Kostas E. Bekris,et al.  A self-supervised learning system for object detection using physics simulation and multi-view pose estimation , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[58]  Danica Kragic,et al.  Deep predictive policy training using reinforcement learning , 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[59]  Ziyan Wu,et al.  DepthSynth: Real-Time Realistic Synthetic Data Generation from CAD Models for 2.5D Recognition , 2017, 2017 International Conference on 3D Vision (3DV).

[60]  M. Levine Empagliflozin for Type 2 Diabetes Mellitus: An Overview of Phase 3 Clinical Trials , 2017, Current diabetes reviews.

[61]  Sergey Levine,et al.  (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image , 2016, Robotics: Science and Systems.