Robotic picking in dense clutter via domain-invariant learning from synthetic dense cluttered rendering

Abstract Robotic picking of a diverse range of novel objects in dense clutter, where objects are stacked tightly together, remains a great challenge. Collecting a large-scale dataset with dense grasp labels is extremely time-consuming, and there is a large gap between synthetic color and depth images and real ones. In this paper, we explore suction-based grasping learned from synthetic dense cluttered renderings. To avoid tedious human labeling, we present a pipeline that models stacked objects in simulation and generates photorealistic RGB-D renderings with dense suction point labels. To reduce the simulation-to-reality gap between synthetic images and a low-quality RGB-D camera, we propose a novel domain-invariant Suction Quality Neural Network (diSQNN), trained on a labeled synthetic dataset and an unlabeled real dataset. Specifically, we fuse realistic color features with adversarial depth features by attaching a domain discriminator to the depth extractor. We evaluate the proposed method against baselines and an existing suction detection method. The results demonstrate the effectiveness of our synthetic dense cluttered rendering, and the proposed diSQNN maintains high transfer performance on real images. On a physical robot with a vacuum-based gripper, the proposed method achieves average picking success rates of 91% on known objects and 88% on novel objects in a tote, without using any manual labels.
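The core idea of diSQNN is a two-stream encoder: the color branch, fed with photorealistic renderings, is trained normally, while the depth branch is trained adversarially so that synthetic and real depth features become indistinguishable. Below is a minimal PyTorch-style sketch of this idea, assuming a gradient-reversal formulation of the adversarial objective; all layer sizes and module names (color_net, depth_net, quality_head, discriminator) are hypothetical illustrations, not the authors' exact architecture.

```python
import torch
import torch.nn as nn


class GradientReversal(torch.autograd.Function):
    """Identity on the forward pass; negates and scales gradients on backward."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None


class DiSQNNSketch(nn.Module):
    """Two-stream suction-quality network: color and depth features are fused
    for per-pixel suction-quality prediction, while a domain discriminator,
    fed through a gradient-reversal layer, drives the depth extractor toward
    features that look the same for synthetic and real images."""

    def __init__(self, lam=1.0):
        super().__init__()
        self.lam = lam
        # Hypothetical small extractors; the paper's actual backbones differ.
        self.color_net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.depth_net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        # Per-pixel suction-quality head over the concatenated features.
        self.quality_head = nn.Conv2d(128, 1, 1)
        # Binary domain classifier (synthetic vs. real) on depth features only.
        self.discriminator = nn.Sequential(
            nn.Conv2d(64, 32, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 1),
        )

    def forward(self, rgb, depth):
        f_color = self.color_net(rgb)
        f_depth = self.depth_net(depth)
        quality = torch.sigmoid(self.quality_head(torch.cat([f_color, f_depth], 1)))
        # Reversed gradients make depth_net learn to fool the discriminator.
        domain_logit = self.discriminator(GradientReversal.apply(f_depth, self.lam))
        return quality, domain_logit
```

Under this sketch, a labeled synthetic batch would contribute both the per-pixel suction loss and a domain loss (label "synthetic"), while an unlabeled real batch would contribute only the domain loss (label "real"); the gradient reversal then pulls the depth features of the two domains together, which is what allows transfer to the low-quality real camera.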
