Domain adversarial transfer for cross-domain and task-constrained grasp pose detection

Abstract: Transferring grasping skills learned in simulation to the real world is attractive for many robotic applications, because collecting and labeling real-world visual grasping datasets is often expensive or even impractical. However, models trained purely on simulated data often generalize poorly to the unseen real world because of the domain gap between the training and testing data. In this paper, we propose a novel domain adversarial transfer network that narrows the domain gap for cross-domain and task-constrained grasp pose detection. Generative adversarial training is exploited to constrain the generator to produce simulation-like data, so that shared features with the joint distribution can be extracted. We also improve the backbone by extracting task-constrained grasp candidates and by constructing the grasp candidate evaluator with a lightweight structure and an embedded recalibration technique. To validate the proposed method, we conducted grasping performance evaluations and task-oriented human–robot interaction experiments. The results indicate that the proposed method achieves state-of-the-art performance in these experimental settings. In particular, it achieved an average task-constrained grasping success rate of 83.3% in the task-oriented human–robot interaction experiment without using any real-world labels.
