Split Deep Q-Learning for Robust Object Singulation

Extracting a known target object from a pile of other objects in a cluttered environment is a challenging robotic manipulation task encountered in many applications. In such conditions, the target object touches or is covered by adjacent obstacle objects, rendering traditional grasping techniques ineffective. In this paper, we propose a pushing policy that singulates the target object from its surrounding clutter by means of lateral pushes of both the neighboring objects and the target object, until sufficient 'grasping room' has been achieved. To this end, we employ reinforcement learning, and in particular Deep Q-learning (DQN), to learn optimal push policies by trial and error. A novel Split DQN is proposed to improve the learning rate and increase the modularity of the algorithm. Experiments show that, although learning is performed in a simulated environment, the learned policies transfer effectively to a real environment thanks to robust feature selection. Finally, we demonstrate that the modularity of the algorithm allows extra primitives to be added without retraining the model from scratch.
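The core idea behind the modularity claim is that each push primitive gets its own independent Q-function head, so a new primitive can be bolted on without disturbing the already-trained heads. The paper's Split DQN operates on visual features with deep networks; the sketch below is a minimal, hypothetical illustration of the same structure using linear Q-function approximation, with all names, dimensions, and hyperparameters invented for the example.

```python
import numpy as np

class SplitQ:
    """Illustrative 'split' Q-function: one independent linear head per
    push primitive, so primitives can be added without retraining others.
    All names and dimensions here are hypothetical, not the paper's."""

    def __init__(self, feat_dim, n_dirs, lr=0.1, gamma=0.9):
        self.heads = {}          # primitive name -> weight matrix
        self.feat_dim = feat_dim
        self.n_dirs = n_dirs     # discrete push directions per primitive
        self.lr = lr
        self.gamma = gamma

    def add_primitive(self, name):
        # A fresh, independently trained head; existing heads untouched.
        self.heads[name] = np.zeros((self.n_dirs, self.feat_dim))

    def q_values(self, feats):
        # {primitive: Q-values over its push directions}
        return {name: w @ feats for name, w in self.heads.items()}

    def act(self, feats, eps=0.1, rng=None):
        # Epsilon-greedy over all (primitive, direction) pairs.
        rng = rng or np.random.default_rng()
        qs = self.q_values(feats)
        if rng.random() < eps:
            name = rng.choice(list(self.heads))
            return name, int(rng.integers(self.n_dirs))
        return max(((n, int(np.argmax(q))) for n, q in qs.items()),
                   key=lambda nd: qs[nd[0]][nd[1]])

    def update(self, feats, action, reward, next_feats, done):
        # One-step Q-learning target; only the chosen head is updated,
        # though the max in the target ranges over all heads.
        name, d = action
        target = reward
        if not done:
            target += self.gamma * max(
                q.max() for q in self.q_values(next_feats).values())
        td_error = target - self.heads[name][d] @ feats
        self.heads[name][d] += self.lr * td_error * feats
```

Adding a primitive after training amounts to one `add_primitive` call followed by training only the new head, which is the modularity property the abstract refers to.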
