Classifying Object Manipulation Actions based on Grasp-types and Motion-Constraints

In this work, we address the challenging problem of fine-grained and coarse-grained recognition of object manipulation actions. Due to variations in geometrical and motion constraints, different sets of manipulation actions can be performed with the same object. Moreover, most object manipulation actions involve subtle movements, which makes recognition difficult from motion information alone. We propose to use grasp and motion-constraint information to recognise actions and to understand action intention with different objects. We also provide an extensive experimental evaluation on the recent Yale Human Grasping dataset, consisting of a large set of 455 manipulation actions. The evaluation covers: a) different contemporary multi-class classifiers, as well as binary classifiers combined with a one-vs-one multi-class voting scheme; b) comparative results on subsets of attributes involving grasp and motion-constraint information; c) fine-grained and coarse-grained object manipulation action recognition based on fine-grained as well as coarse-grained grasp-type information; and d) a comparison between instance-level and sequence-level modeling of object manipulation actions. Our results demonstrate the efficacy of grasp attributes for fine-grained and coarse-grained object manipulation action recognition.
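The one-vs-one multi-class voting scheme mentioned in the evaluation can be sketched as follows: one binary classifier is trained per pair of action classes, and at test time each pairwise classifier casts a vote, with the most-voted class winning. This is a minimal illustrative sketch only; the classifier choice, features, and synthetic data here are stand-in assumptions, not the paper's actual setup (which uses grasp and motion-constraint attributes from the Yale Human Grasping dataset).

```python
from itertools import combinations
from collections import Counter

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical stand-in data with 4 "action" classes; the paper's
# features would instead encode grasp types and motion constraints.
X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=6, n_classes=4,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Train one binary classifier per unordered pair of classes.
pair_clfs = {}
for a, b in combinations(sorted(set(y_tr)), 2):
    mask = (y_tr == a) | (y_tr == b)
    clf = SVC(kernel="linear").fit(X_tr[mask], y_tr[mask])
    pair_clfs[(a, b)] = clf

def predict_ovo(x):
    # Each pairwise classifier votes for one of its two classes;
    # the class with the most votes is the prediction.
    votes = Counter(clf.predict(x.reshape(1, -1))[0]
                    for clf in pair_clfs.values())
    return votes.most_common(1)[0][0]

preds = [predict_ovo(x) for x in X_te]
```

With K classes this trains K(K-1)/2 binary classifiers (6 here); scikit-learn's `OneVsOneClassifier` wraps the same strategy if a library implementation is preferred.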
