View-independent recognition of grasping actions with a cortex-inspired model

Recognizing how people interact with objects is essential for humans and for artificial systems such as robots. However, this recognition task is difficult: it requires capturing the details of the effector and the goal object under a wide range of image transformations, such as changes in view or position. Here, we demonstrate how specific effector-object interactions can be recognized efficiently by a simple, biologically plausible neural model. In line with biological evidence, the model uses a view-based approach to recognize grasping sequences from videos, and it generalizes to untrained views by interpolating between stored example views. In addition, it introduces a novel, physiologically plausible mechanism that captures the spatial relationship between effector and object. The results support the view that where and how an agent will grasp an object can be predicted without estimating the 3D structure of the scene.
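
The idea of generalizing to untrained views from stored example views can be illustrated with a minimal sketch. It is not the model described here: the feature function, the grip names, the trained view angles, the tuning width, and the use of Gaussian view-tuned units pooled by a maximum are all assumptions chosen for illustration.

```python
import math

# Hypothetical action labels and trained view angles (degrees); not from the source.
ACTIONS = ["power_grip", "precision_grip"]
TRAINED_VIEWS = [0.0, 45.0, 90.0]


def toy_features(action, view_deg):
    """Toy stand-in for the shape features of one action seen from one view."""
    t = math.radians(view_deg)
    offset = 0.0 if action == "power_grip" else 3.0
    return [math.cos(t) + offset, math.sin(t), offset]


def rbf(x, center, sigma):
    """Gaussian tuning: response of one view-tuned unit to feature vector x."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


# One stored template per (action, trained view).
TEMPLATES = {a: [toy_features(a, v) for v in TRAINED_VIEWS] for a in ACTIONS}


def view_invariant_response(x, action, sigma=0.5):
    """Pool the view-tuned units of one action with a maximum (an assumption)."""
    return max(rbf(x, c, sigma) for c in TEMPLATES[action])


def classify(x):
    """Return the action whose pooled response to x is largest."""
    return max(ACTIONS, key=lambda a: view_invariant_response(x, a))


# A view of 22.5 degrees lies between the trained views 0 and 45 degrees.
untrained = toy_features("precision_grip", 22.5)
print(classify(untrained))
```

Because each unit's tuning curve falls off smoothly around its stored view, an input from an intermediate view still drives the neighboring units of the correct action, so no explicit 3D reconstruction is needed in this toy setting.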
