Learning 6-DoF Grasping and Pick-Place Using Attention Focus

We address a class of manipulation problems in which the robot perceives the scene with a depth sensor and can move its end effector with six degrees of freedom: 3D position and orientation. We formulate the problem as a Markov decision process (MDP) with abstract yet generally applicable state and action representations. Finding a good solution to the MDP requires constraining the allowed actions. We develop a specific set of constraints, called hierarchical $\text{SE}(3)$ sampling (HSE3S), which causes the robot to learn a sequence of gazes that focus attention on the task-relevant parts of the scene. We demonstrate the effectiveness of our approach on three challenging pick-place tasks (with novel objects in clutter and nontrivial places), both in simulation and on a real robot, even though all training is done in simulation.
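The abstract does not spell out how HSE3S is implemented, so the following is only a loose illustration of the general coarse-to-fine idea, restricted to 3D position. It is not the authors' method. The function name `coarse_to_fine_search`, the `score` callback, and all parameters (`levels`, `samples`, `shrink`) are assumptions introduced for this sketch. Each level samples candidate positions in a cube around the current focus, keeps the best-scoring one, and narrows the cube around it, much as a sequence of gazes narrows attention.

```python
import random


def coarse_to_fine_search(score, center, half_width,
                          levels=3, samples=64, shrink=0.25, seed=0):
    """Hypothetical coarse-to-fine search over 3D positions.

    score      -- callable mapping an (x, y, z) tuple to a float (higher is better)
    center     -- initial focus point, an (x, y, z) tuple
    half_width -- half the side length of the initial sampling cube
    """
    rng = random.Random(seed)
    best = tuple(center)
    best_score = score(best)
    for _ in range(levels):
        # Sample candidates uniformly in a cube around the current focus.
        for _ in range(samples):
            candidate = tuple(
                c + rng.uniform(-half_width, half_width) for c in center
            )
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
        # Refocus on the best point found so far and narrow the region.
        center = best
        half_width *= shrink
    return best, best_score


if __name__ == "__main__":
    # Toy score (an assumption): negative squared distance to a made-up target.
    target = (0.3, -0.2, 0.5)

    def toy_score(p):
        return -sum((a - b) ** 2 for a, b in zip(p, target))

    position, value = coarse_to_fine_search(toy_score, (0.0, 0.0, 0.0), 1.0)
    print(position, value)
```

In the paper's setting the score would come from learned value estimates and the samples would also cover orientation; both are omitted here to keep the sketch self-contained.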
