Visual recognition of grasps for human-to-robot mapping

This paper presents a vision-based method for grasp classification. It is developed as part of a Programming by Demonstration (PbD) system in which recognition of objects and pick-and-place actions are the basic building blocks for task learning. In contrast to earlier approaches, no articulated 3D reconstruction of the hand over time is performed. The input consists of a single image of the human hand. A 2D representation of the hand shape, based on gradient orientation histograms, is extracted from the image. The hand shape is then classified as one of six grasps by finding similar hand shapes in a large database of grasp images. The database search is performed using Locality Sensitive Hashing (LSH), an approximate k-nearest neighbor approach. The nearest neighbors also give an estimate of the hand orientation with respect to the camera. The six human grasps are mapped to three Barrett hand grasps. Depending on the type of robot grasp, a precomputed grasp strategy is selected. The strategy is further parameterized by the orientation of the hand relative to the object. To evaluate the method's potential as part of a robust vision system, experiments were performed that compare its classification results to a baseline of human classification performance. The experiments showed the LSH recognition performance to be comparable to human performance.
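The classification pipeline described above (orientation histogram, approximate nearest-neighbor lookup, vote over the neighbors' grasp labels) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the implementation used in the paper: the function and class names are hypothetical, the histogram is computed over the whole image with central differences, the LSH family is random-hyperplane hashing (swapped in for brevity), and the labels `grasp_a` and `grasp_b` in the usage note are placeholders for the six grasp classes.

```python
import math
import random
from collections import Counter, defaultdict


def orientation_histogram(image, bins=8):
    """Magnitude-weighted histogram of gradient orientations (assumed form)."""
    hist = [0.0] * bins
    rows, cols = len(image), len(image[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Central differences approximate the image gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            # Fold the angle into [0, pi) so opposite gradient signs share a bin.
            angle = math.atan2(gy, gx) % math.pi
            hist[min(int(angle / math.pi * bins), bins - 1)] += mag
    total = sum(hist)
    return [h / total for h in hist] if total else hist


class HyperplaneLSH:
    """Random-hyperplane LSH index; an illustrative stand-in, not the paper's scheme."""

    def __init__(self, dim, n_tables=8, n_bits=6, seed=0):
        rng = random.Random(seed)
        # One set of n_bits random hyperplanes per hash table.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
            for _ in range(n_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(n_tables)]
        self.items = []  # stored (feature vector, label) pairs

    @staticmethod
    def _key(planes, vec):
        # Each bit records on which side of a hyperplane the vector falls.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)

    def add(self, vec, label):
        idx = len(self.items)
        self.items.append((vec, label))
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(idx)

    def query(self, vec, k=3):
        # Collect every item sharing a bucket with the query in any table,
        # then rank only those candidates by exact squared Euclidean distance.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), []))

        def sq_dist(i):
            return sum((a - b) ** 2 for a, b in zip(vec, self.items[i][0]))

        return [self.items[i] for i in sorted(candidates, key=sq_dist)[:k]]


def classify(index, vec, k=3):
    """Majority vote over the labels of the approximate nearest neighbors."""
    neighbors = index.query(vec, k)
    if not neighbors:
        return None  # no bucket collision: the approximate search found nothing
    return Counter(label for _, label in neighbors).most_common(1)[0][0]
```

A database would be built by calling `index.add(orientation_histogram(img), "grasp_a")` for each labeled image, after which `classify(index, orientation_histogram(query_img))` returns the winning label. Storing a hand orientation next to each label would let the same neighbors supply the orientation estimate mentioned above.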
