Robotic Hand Pose Estimation Based on Stereo Vision and GPU-enabled Internal Graphical Simulation

Humanoid robots have complex kinematic chains whose modeling is error-prone. If the robot model is not well calibrated, the hand pose cannot be determined precisely from the encoder readings, which degrades reaching and grasping accuracy. We propose a novel method that simultaneously i) estimates the pose of the robot hand and ii) calibrates the robot kinematic model. The method combines stereo vision, proprioception, and a 3D computer graphics model of the robot. Notably, GPU programming allows the estimation and calibration to be performed in real time while the arm executes reaching movements. Proprioceptive information is used to generate hypotheses about the visual appearance of the hand in the camera images, using the 3D computer graphics model of the robot, which includes both kinematic and texture information. These hypotheses are compared with the actual visual input using particle filtering, yielding both i) the best estimate of the hand pose and ii) a set of joint offsets that calibrate the kinematics of the robot model. We evaluate two different approaches to estimating the 6D pose of the hand from vision (silhouette segmentation and edge extraction) and show experimentally that the pose estimation error is considerably reduced with respect to the nominal robot model. Moreover, the GPU implementation runs about 3 times faster than the CPU one, allowing real-time operation.
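
To make the hypothesis-and-compare loop concrete, the following is a minimal sketch of particle filtering over joint offsets. It is not the authors' implementation and rests on several simplifying assumptions: a planar 2-link arm replaces the humanoid kinematic chain, a directly observed 2D hand position replaces the comparison between rendered and camera images, and a Gaussian likelihood on the position error replaces the silhouette or edge similarity score. All function names, link lengths, and noise parameters are hypothetical.

```python
import math
import random


def forward_kinematics(q, offsets, link_lengths=(0.3, 0.25)):
    """Hand (x, y) of a planar 2-link arm from encoder angles plus joint offsets."""
    a1 = q[0] + offsets[0]
    a2 = a1 + q[1] + offsets[1]
    l1, l2 = link_lengths
    return (l1 * math.cos(a1) + l2 * math.cos(a2),
            l1 * math.sin(a1) + l2 * math.sin(a2))


def simulate_observations(true_offsets, n=40, noise=0.002, seed=1):
    """Toy stand-in for vision: noisy hand positions at random arm postures."""
    rng = random.Random(seed)
    obs = []
    for _ in range(n):
        q = (rng.uniform(-0.5, 1.0), rng.uniform(0.5, 1.5))
        x, y = forward_kinematics(q, true_offsets)
        obs.append((q, (x + rng.gauss(0, noise), y + rng.gauss(0, noise))))
    return obs


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples in proportion to the normalized weights."""
    n = len(particles)
    step = 1.0 / n
    u = rng.random() * step
    out, cum, i = [], weights[0], 0
    for _ in range(n):
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
        u += step
    return out


def estimate_offsets(observations, n_particles=300, sigma_obs=0.01,
                     sigma_jitter=0.005, seed=0):
    """Each particle is one hypothesis about the two joint offsets (radians)."""
    rng = random.Random(seed)
    particles = [(rng.gauss(0, 0.1), rng.gauss(0, 0.1))
                 for _ in range(n_particles)]
    for q, hand in observations:
        # Weight each hypothesis by how well its predicted hand matches the observation.
        weights = []
        for p in particles:
            x, y = forward_kinematics(q, p)
            d2 = (x - hand[0]) ** 2 + (y - hand[1]) ** 2
            weights.append(math.exp(-d2 / (2 * sigma_obs ** 2)))
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Resample, then jitter so the particle set keeps exploring.
        particles = systematic_resample(particles, weights, rng)
        particles = [(a + rng.gauss(0, sigma_jitter), b + rng.gauss(0, sigma_jitter))
                     for a, b in particles]
    mean_0 = sum(p[0] for p in particles) / n_particles
    mean_1 = sum(p[1] for p in particles) / n_particles
    return (mean_0, mean_1)


if __name__ == "__main__":
    observations = simulate_observations(true_offsets=(0.08, -0.05))
    print(estimate_offsets(observations))
```

In this sketch the particles are evaluated one at a time in a Python loop. The weighting step is independent per particle, which is the part that lends itself to parallel evaluation on a GPU.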
