3D Pose Estimation of Robot Arm with RGB Images Based on Deep Learning

In the field of human-robot interaction, collision avoidance between a robot and a human sharing its workspace remains a challenge. Many researchers use visual methods to detect collisions between robots and obstacles under the assumption that the robot pose is already known, because the robot's state is read from the controller and hand-eye calibration has been performed; they therefore focus on predicting the motion of obstacles. In this paper, a real-time method based on deep learning is proposed to estimate the 3D pose of a robot arm directly from a single color image. The method removes the need for hand-eye calibration whenever the system is reconfigured and increases the system's flexibility by eliminating the requirement that the camera be fixed relative to the robot. Our approach makes two main contributions. First, unlike other deep learning methods, it estimates the 3D position of the robot base and the 3D positions of predefined key points relative to that base separately, which accounts for the limitations of the dataset. Second, part of the dataset is collected with another trained network to avoid a tedious calibration process, and that network is reused in the pose estimation task. Finally, experiments show that the fully trained system provides accurate 3D pose estimates of the robot arm in the camera coordinate system: the average 3D position errors of the robot base and the predefined key points are 2.35 cm and 1.99 cm, respectively.
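To make the two-branch formulation concrete, the sketch below shows how the two separately estimated quantities described in the abstract (the base position in the camera frame and the per-keypoint offsets relative to the base) could be composed into absolute 3D keypoint positions, and how the reported average position error could be computed. This is a minimal illustration, not the authors' implementation; the function names, array shapes, and sample numbers are assumptions for the example.

```python
import numpy as np

def compose_keypoints(base_cam, offsets):
    """Combine the estimated 3D base position (camera frame, shape (3,))
    with keypoint offsets predicted relative to the robot base (shape (K, 3)),
    yielding absolute 3D keypoint positions in the camera coordinate system."""
    base_cam = np.asarray(base_cam, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    return base_cam + offsets  # broadcasts to shape (K, 3)

def mean_position_error(pred, gt):
    """Average Euclidean distance between predicted and ground-truth 3D
    positions, i.e. the kind of per-point error the paper reports in cm."""
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    return np.linalg.norm(pred - gt, axis=-1).mean()

# Example with made-up values (metres): a base estimate and two key points.
base_cam = [0.10, -0.05, 1.20]          # hypothetical network output
rel_offsets = [[0.00, 0.00, 0.00],      # the base itself
               [0.05, 0.30, 0.10]]      # e.g. an elbow joint
keypoints_cam = compose_keypoints(base_cam, rel_offsets)
```

Estimating the base and the relative offsets separately, as the abstract describes, means each branch can be trained on data that constrains only its own output, which is why the composition step above is all that is needed to recover camera-frame keypoint positions.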
