Deep instance segmentation and 6D object pose estimation in cluttered scenes for robotic autonomous grasping

Purpose – This paper aims to design a deep neural network for object instance segmentation and six-dimensional (6D) pose estimation in cluttered scenes, and to apply the proposed method to real-world robotic autonomous grasping of household objects.

Design/methodology/approach – A novel deep learning method is proposed for instance segmentation and 6D pose estimation in cluttered scenes. An iterative pose refinement network is integrated with the main network to obtain more robust final pose estimates for robotic applications. To train the network, a technique is presented that rapidly generates abundant annotated synthetic data, consisting of RGB-D images and object masks, without any hand-labeling. For robotic grasping, offline grasp planning based on an eigengrasp planner is combined with online object pose estimation.

Findings – Experiments on standard pose benchmark data sets show that the method achieves better pose estimation accuracy and time efficiency than state-of-the-art methods that use depth-based ICP refinement. The proposed method is also evaluated on a seven-DOF Kinova Jaco robot with an Intel RealSense RGB-D camera; the grasping results show that the method is accurate and robust enough for real-world robotic applications.

Originality/value – A novel 6D pose estimation network built on an instance segmentation framework is proposed, and a neural network-based iterative pose refinement module is integrated into the method. The proposed method exhibits satisfactory pose estimation accuracy and time efficiency for robotic grasping.
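To make the refinement step concrete, below is a minimal Python sketch of network-based iterative pose refinement as described in the abstract. The `predict_delta` callable is a hypothetical stand-in for the refinement network (the paper's actual interface is not given here); only the composition of the predicted corrections is standard.

```python
import numpy as np

def refine_pose(R, t, predict_delta, n_iters=2):
    """Iteratively refine an initial 6D pose (R: 3x3 rotation, t: 3-vector).

    `predict_delta` is a hypothetical stand-in for the refinement network:
    given the current pose hypothesis, it returns a small corrective
    rotation `dR` (3x3) and translation `dt` (3-vector).
    """
    for _ in range(n_iters):
        dR, dt = predict_delta(R, t)  # network forward pass (assumed interface)
        R = dR @ R                    # left-compose the rotation correction
        t = dR @ t + dt               # apply the correction in the camera frame
    return R, t
```

A few such iterations replace the depth-based ICP refinement used by the baselines, which is consistent with the time-efficiency advantage reported above. The abstract also combines offline eigengrasp planning with online pose estimation; under the usual homogeneous-transform convention, this amounts to mapping a grasp planned once in the object's frame into the camera (or robot) frame through the estimated object pose. A sketch under that assumption:

```python
def grasp_in_camera_frame(T_cam_obj, T_obj_grasp):
    """Map an offline-planned grasp into the camera frame.

    T_cam_obj:   4x4 object pose from the online pose estimator.
    T_obj_grasp: 4x4 grasp pose from the offline (eigengrasp) planner,
                 expressed in the object frame.
    """
    return T_cam_obj @ T_obj_grasp  # chain the homogeneous transforms
```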
