Conditional Patch-Based Domain Randomization: Improving Texture Domain Randomization Using Natural Image Patches

Training deep learning systems on Domain Randomized synthetic data is a promising way to reduce the data and labeling requirements of supervised techniques and to bridge the gap between simulation and the real world. We propose a novel approach for generating and applying class-specific Domain Randomization textures using randomly cropped image patches from real-world data. Evaluated against current Domain Randomization texture application techniques, our approach outperforms the highest-performing technique by 4.94 AP on object detection and 6.71 AP on semantic segmentation on the YCB-M [3] real-world robotics dataset. Our approach is a fast and inexpensive way of generating Domain Randomized textures, and it avoids the need to handcraft the texture distributions currently in use.
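
The patch-cropping idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes real-world images are already loaded as NumPy arrays and grouped by class label, and the names `sample_patch`, `build_texture_bank`, `patch_size`, and `patches_per_image` are hypothetical. How the patches are then applied to 3D models in a renderer is not covered here.

```python
import numpy as np


def sample_patch(image, patch_size, rng):
    """Crop one random square patch from an H x W x C image array."""
    h, w = image.shape[:2]
    if patch_size > min(h, w):
        raise ValueError("patch_size exceeds image dimensions")
    # Draw the top-left corner uniformly so the patch stays inside the image.
    top = int(rng.integers(0, h - patch_size + 1))
    left = int(rng.integers(0, w - patch_size + 1))
    return image[top:top + patch_size, left:left + patch_size].copy()


def build_texture_bank(images_by_class, patch_size, patches_per_image, seed=0):
    """Return {class_label: [patch, ...]} built from random crops.

    images_by_class maps a class label to a list of image arrays
    (hypothetical input layout, assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    bank = {}
    for label, images in images_by_class.items():
        bank[label] = [
            sample_patch(img, patch_size, rng)
            for img in images
            for _ in range(patches_per_image)
        ]
    return bank
```

Keeping one list of patches per class is what would make the textures class-specific in this sketch: at render time, a texture for an object would be drawn only from the list matching its class.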

[1] Ales Leonardis, et al. Quantifying the Use of Domain Randomization, 2021, 2020 25th International Conference on Pattern Recognition (ICPR).

[2] Tero Karras, et al. Training Generative Adversarial Networks with Limited Data, 2020, NeurIPS.

[3] Joachim Hertzberg, et al. YCB-M: A Multi-Camera RGB-D Dataset for Object Recognition and 6DoF Pose Estimation, 2020, 2020 IEEE International Conference on Robotics and Automation (ICRA).

[4] Trevor Darrell, et al. BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning, 2018, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5] Ross B. Girshick, et al. Mask R-CNN, 2017, arXiv:1703.06870.

[6] Samira Pouyanfar, et al. ROADS: Randomization for Obstacle Avoidance and Driving in Simulation, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[7] Jeff Donahue, et al. Large Scale GAN Training for High Fidelity Natural Image Synthesis, 2018, ICLR.

[8] Dieter Fox, et al. Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects, 2018, CoRL.

[9] Alexandre Bernardino, et al. Applying Domain Randomization to Synthetic Data for Object Category Detection, 2018, ArXiv.

[10] Varun Jampani, et al. Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization, 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[11] Dieter Fox, et al. PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes, 2017, Robotics: Science and Systems.

[12] Jaakko Lehtinen, et al. Progressive Growing of GANs for Improved Quality, Stability, and Variation, 2017, ICLR.

[13] Marcin Andrychowicz, et al. Asymmetric Actor Critic for Image-Based Robot Learning, 2017, Robotics: Science and Systems.

[14] Sergey Levine, et al. Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping, 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[15] Stephen Tyree, et al. Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control, 2017, ArXiv.

[16] Martial Hebert, et al. Cut, Paste and Learn: Surprisingly Easy Synthesis for Instance Detection, 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17] Andrew J. Davison, et al. Transferring End-to-End Visuomotor Control from Simulation to Real World for a Multi-Stage Task, 2017, CoRL.

[18] Wojciech Zaremba, et al. Domain randomization for transferring deep neural networks from simulation to the real world, 2017, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[19] P. Abbeel, et al. Yale-CMU-Berkeley dataset for robotic manipulation research, 2017, Int. J. Robotics Res.

[20] Jana Kosecka, et al. A dataset for developing and benchmarking active vision, 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[21] Sergey Levine, et al. (CAD)$^2$RL: Real Single-Image Flight without a Single Real Image, 2016, Robotics: Science and Systems.

[22] Leonidas J. Guibas, et al. Render for CNN: Viewpoint Estimation in Images Using CNNs Trained with Rendered 3D Model Views, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[23] Yoshua Bengio, et al. Generative Adversarial Nets, 2014, NIPS.

[24] Pietro Perona, et al. Microsoft COCO: Common Objects in Context, 2014, ECCV.

[25] Luc Van Gool, et al. The Pascal Visual Object Classes Challenge: A Retrospective, 2014, International Journal of Computer Vision.