Robot arm pose estimation through pixel-wise part classification

We frame marker-less robot arm pose estimation as a pixel-wise part classification problem. The input is a depth image, and each of its pixels is classified as belonging either to a particular robot part or to the background. The classifier is a random decision forest trained on a large number of synthetically generated, labeled depth images. From the training samples that end up at each leaf node, a set of offsets is learned that vote for relative joint positions. Pooling these votes over all foreground pixels and then clustering them yields an estimate of the true joint positions. Because pixel-wise classification is intrinsically parallel, the approach can run faster than real time and is more efficient than previous ICP-like methods. We quantitatively evaluate its accuracy on synthetic data, and we also demonstrate that it produces accurate joint estimates on real data despite being trained purely on synthetic data.
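The vote pooling and clustering step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says "clustering", so mean shift with a Gaussian kernel is an assumption here, as are the function names `pool_votes` and `mean_shift_mode`, the bandwidth, and the synthetic data. The forest itself is left out; the learned leaf offsets are simulated as noisy vectors pointing from each foreground pixel's 3D point to one joint, with a fraction of wrong votes mixed in.

```python
import numpy as np

def pool_votes(pixel_points, leaf_offsets):
    """One vote per foreground pixel: its 3D point plus the offset stored at its leaf."""
    return pixel_points + leaf_offsets

def mean_shift_mode(votes, bandwidth=0.05, iters=50, tol=1e-6):
    """Return one mode of the vote density (Gaussian kernel, assumed bandwidth in metres)."""
    # Start from the coordinate-wise median, which tolerates a minority of outliers.
    mode = np.median(votes, axis=0)
    for _ in range(iters):
        d2 = np.sum((votes - mode) ** 2, axis=1)
        weights = np.exp(-0.5 * d2 / bandwidth ** 2)
        new_mode = (weights[:, None] * votes).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_mode - mode) < tol:
            mode = new_mode
            break
        mode = new_mode
    return mode

# Synthetic stand-in for the forest output (all values are illustrative).
rng = np.random.default_rng(0)
joint = np.array([0.3, -0.1, 1.2])           # hypothetical true joint position

# 400 pixels whose leaf offsets point at the joint, up to 1 cm of noise.
good_points = rng.uniform(-0.5, 0.5, size=(400, 3)) + joint
good_offsets = joint - good_points + rng.normal(0.0, 0.01, size=(400, 3))

# 100 pixels that landed in the wrong leaf and vote for arbitrary positions.
bad_points = rng.uniform(-0.5, 0.5, size=(100, 3)) + joint
bad_offsets = rng.uniform(-1.0, 2.0, size=(100, 3)) - bad_points

votes = pool_votes(np.vstack([good_points, bad_points]),
                   np.vstack([good_offsets, bad_offsets]))
estimate = mean_shift_mode(votes)
print("estimated joint:", estimate)
print("error (m):", np.linalg.norm(estimate - joint))
```

In this sketch a mode is used rather than the plain mean of all votes because votes from misclassified pixels would pull a mean away from the joint, whereas the kernel weights make distant votes nearly irrelevant. Each pixel contributes its vote independently, which is the property that makes the pooling step easy to parallelize.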
