Effectiveness of Grasp Attributes and Motion-Constraints for Fine-Grained Recognition of Object Manipulation Actions

In this work, we consider the problem of recognizing object manipulation actions. This is a challenging task for real, everyday actions, as the same object can be grasped and moved in different ways depending on its function and the geometric constraints of the task. We propose to leverage grasp and motion-constraint information, through a suitable representation, to recognize actions and understand the intention behind manipulations of different objects. We also provide an extensive experimental evaluation on the recent Yale Human Grasping dataset, which contains a large set of 455 manipulation actions. The evaluation involves a) different contemporary multi-class classifiers, as well as binary classifiers combined with a one-vs-one multi-class voting scheme, and b) comparative results based on subsets of attributes covering grasp and motion-constraint information. Our results clearly demonstrate the usefulness of grasp characteristics and motion constraints for understanding the actions intended with an object.
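The one-vs-one multi-class voting scheme mentioned in the evaluation can be sketched as follows. This is a minimal illustration only: the choice of a linear SVM (via scikit-learn) as the binary classifier and the toy data are assumptions for demonstration, not the paper's actual features or experimental setup.

```python
# Hedged sketch of one-vs-one multi-class voting with binary classifiers.
# NOTE: classifier choice (linear SVM) and toy data are illustrative
# assumptions, not the paper's actual grasp/motion-constraint features.
from itertools import combinations

import numpy as np
from sklearn.svm import SVC


def ovo_predict(X_train, y_train, X_test):
    """Train one binary SVM per class pair; predict by majority vote."""
    classes = np.unique(y_train)
    votes = np.zeros((len(X_test), len(classes)), dtype=int)
    for i, j in combinations(range(len(classes)), 2):
        # Restrict training data to the current pair of classes.
        mask = np.isin(y_train, [classes[i], classes[j]])
        clf = SVC(kernel="linear").fit(X_train[mask], y_train[mask])
        pred = clf.predict(X_test)
        # Each pairwise classifier casts one vote per test sample.
        votes[:, i] += (pred == classes[i])
        votes[:, j] += (pred == classes[j])
    # The class collecting the most pairwise votes wins.
    return classes[votes.argmax(axis=1)]


# Toy example: three well-separated 2-D clusters standing in for action classes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.1, (20, 2)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 20)
print(ovo_predict(X, y, np.array([[0.0, 0.0], [3.0, 3.0], [6.0, 6.0]])))
```

With K classes this scheme trains K(K-1)/2 binary classifiers; ties in the vote are broken here by the lowest class index via `argmax`.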
