An Initialization Method of Deep Q-network for Learning Acceleration of Robotic Grasp

Self-supervised learning of robotic grasping generally relies on a model-free reinforcement learning method such as a Deep Q-network (DQN). A DQN uses a high-dimensional Q-network to infer dense pixel-wise probability maps of affordances for grasping actions, which typically leads to a time-consuming training process. Inspired by initialization strategies in optimization algorithms, we propose an initialization method for accelerating self-supervised learning of robotic grasping. The Q-network is pre-trained by supervised learning of affordance maps, using only a small dataset with coarse-grained labels, before the robotic grasp training begins. With the pre-trained Q-network, a robot can then be trained through self-supervised trial-and-error in a purposeful manner, avoiding meaningless grasp attempts in empty regions. We evaluate the proposed method with Mean Square Error, Smooth L1, and Kullback-Leibler Divergence (KLD) loss functions in the pre-training phase. The results indicate that the KLD loss predicts affordances accurately with less noise in empty regions. Moreover, our method accelerates self-supervised learning significantly in the early stage and is largely insensitive to the sparsity of objects in the workspace.
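
As a rough illustration of the pre-training phase described above, the following PyTorch-style sketch shows how a pixel-wise affordance Q-network could be pre-trained on coarse labels with a KL-divergence loss. The network, tensor shapes, and helper names (e.g. `affordance_net`, `pretrain_step`) are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch, assuming a fully convolutional "affordance_net" that maps an
# RGB-D heightmap to a 1-channel pixel-wise grasp affordance map, and coarse
# binary labels marking object regions (1 on objects, 0 on empty workspace).
import torch
import torch.nn.functional as F

def pretrain_step(affordance_net, optimizer, heightmap, coarse_label):
    """One supervised pre-training step with a KL-divergence loss.

    heightmap:    (B, C, H, W) input image tensor
    coarse_label: (B, 1, H, W) coarse-grained affordance annotation
    """
    logits = affordance_net(heightmap)  # (B, 1, H, W) raw affordance scores
    b = logits.size(0)

    # Treat each map as a distribution over pixels so the KLD is well defined.
    log_pred = F.log_softmax(logits.view(b, -1), dim=1)
    target = coarse_label.view(b, -1)
    target = target / target.sum(dim=1, keepdim=True).clamp_min(1e-8)

    loss = F.kl_div(log_pred, target, reduction='batchmean')

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch the KLD target is the normalized coarse label map, which pushes predicted affordance mass toward object regions and away from empty workspace; MSE or Smooth L1 pre-training would instead regress the raw per-pixel values.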
