Action Recognition with a Bio-inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions

Here we show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates. It is a feedforward model restricted to the V1-MT cortical layers; its cortical cells cover the visual space with a foveated structure and, more importantly, it reproduces some of the richness of the center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Using motion maps as feature vectors, we applied a standard classification method to the Weizmann database and obtained an average recognition rate of 98.9%, which is superior to the recent results of Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models incorporating other brain mechanisms and cortical layers in order to deal with more complex videos.
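As an illustration of why a center-surround interaction responds to motion contrast rather than to motion alone, the following is a minimal sketch and not the model described above. It assumes an antagonistic surround implemented as a half-wave rectified difference of Gaussians applied to a 2-D map of velocity-detector responses; the function names, the Gaussian widths, and the toy inputs are all hypothetical choices made for this example.

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma and normalised to sum to 1."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def blur(image, sigma):
    """Separable Gaussian blur with edge padding, so constant maps stay constant."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    # Filter along rows, then along columns; "valid" removes the padding again.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def center_surround(response, sigma_center=1.0, sigma_surround=3.0):
    """Assumed antagonistic surround: rectified (center pooling - surround pooling)."""
    drive = blur(response, sigma_center) - blur(response, sigma_surround)
    return np.maximum(drive, 0.0)

# Toy response maps of a velocity detector tuned to one direction and speed.
uniform = np.ones((32, 32))            # the whole field moves coherently
patch = np.zeros((32, 32))
patch[12:20, 12:20] = 1.0              # only a small region moves

uniform_out = center_surround(uniform)
patch_out = center_surround(patch)

print("uniform motion, max response:", uniform_out.max())
print("moving patch, response at patch center:", patch_out[16, 16])
```

Under these assumptions the unit is nearly silent for coherent full-field motion, because center and surround pooling cancel, and it responds when the motion in its center differs from the motion in its surround. Varying the surround geometry (for example, making it asymmetric) would give other kinds of motion-contrast selectivity, which is the sort of diversity the abstract refers to.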

[1]  Jitendra Malik,et al.  Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[2]  Pietro Perona,et al.  Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[3]  Tae-Kyun Kim,et al.  Learning Motion Categories using both Semantic and Structural Information , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2006, BMVC.

[5]  Thomas Serre,et al.  A Biologically Inspired System for Action Recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[6]  S. Grossberg,et al.  Laminar cortical dynamics of visual form and motion interactions during coherent object motion perception , 2007, Spatial vision.

[7]  Barbara Caputo,et al.  Local velocity-adapted motion events for spatio-temporal recognition , 2007, Comput. Vis. Image Underst..

[8]  Lihi Zelnik-Manor,et al.  Statistical analysis of dynamic actions , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Jing Liu,et al.  Functional organization of speed tuned neurons in visual area MT , 2003, Journal of neurophysiology.

[10]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[11]  A. Yuille,et al.  A model for the estimate of local image velocity by cells in the visual cortex , 1990, Proceedings of the Royal Society of London. B. Biological Sciences.

[12]  Eero P. Simoncelli,et al.  A model of neuronal responses in visual area MT , 1998, Vision Research.

[13]  T. Albright,et al.  Adaptive Surround Modulation in Cortical Area MT , 2007, Neuron.

[14]  Flemming Topsøe,et al.  Some inequalities for information divergence and related measures of discrimination , 2000, IEEE Trans. Inf. Theory.

[15]  Pierre Kornprobst,et al.  A Simple Mechanism to Reproduce the Neural Solution of the Aperture Problem in Monkey Area MT , 2008.

[16]  Alexander Thiele,et al.  Speed skills: measuring the visual speed analyzing properties of primate MT neurons , 2001, Nature Neuroscience.

[17]  M. Lappe,et al.  Visual areas involved in the perception of human movement from dynamic form analysis , 2005, Neuroreport.

[18]  G. Orban,et al.  Spatial heterogeneity of inhibitory surrounds in the middle temporal visual area , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[19]  Thomas Serre,et al.  Object recognition with features inspired by visual cortex , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[20]  A. Destexhe,et al.  The high-conductance state of neocortical neurons in vivo , 2003, Nature Reviews Neuroscience.

[21]  John A. Perrone,et al.  A visual motion sensor based on the properties of V1 and MT neurons , 2004, Vision Research.

[22]  T. Sejnowski,et al.  A selection model for motion processing in area MT of primates , 1995, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[23]  J. Robson.  Spatial and Temporal Contrast-Sensitivity Functions of the Visual System , 1966.

[24]  Nicole C. Rust,et al.  Do We Know What the Early Visual System Does? , 2005, The Journal of Neuroscience.

[25]  Heiko Neumann,et al.  Disambiguating Visual Motion by Form-Motion Interaction—a Computational Model , 2007, International Journal of Computer Vision.

[26]  D. G. Albrecht,et al.  Nonlinear Properties of Visual Cortex Neurons: Temporal Dynamics, Stimulus Selectivity, Neural Performance , 2002.

[27]  Lihi Zelnik-Manor,et al.  Event-based analysis of video , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[28]  G. Orban,et al.  The spatial distribution of the antagonistic surround of MT/V5 neurons , 1997, Cerebral cortex.

[29]  M. Irani,et al.  Event-Based Video Analysis , 2001.

[30]  Ronen Basri,et al.  Actions as space-time shapes , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[31]  Robert T. Collins,et al.  Silhouette-based human identification from body shape and gait , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[32]  Nicholas J. Priebe,et al.  The Neural Representation of Speed in Macaque Area MT/V5 , 2003, The Journal of Neuroscience.

[33]  Serge J. Belongie,et al.  Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[34]  Eero P. Simoncelli,et al.  How MT cells analyze the motion of visual patterns , 2006, Nature Neuroscience.

[35]  E H Adelson,et al.  Spatiotemporal energy models for the perception of motion , 1985, Journal of the Optical Society of America. A, Optics and image science.

[36]  John K. Tsotsos,et al.  Attending to visual motion , 2005, Comput. Vis. Image Underst..

[37]  T. Poggio,et al.  Cognitive neuroscience: Neural mechanisms for the recognition of biological movements , 2003, Nature Reviews Neuroscience.

[38]  Steven M. Seitz,et al.  View-Invariant Analysis of Cyclic Motion , 1997, International Journal of Computer Vision.

[39]  Maurice Milgram,et al.  Recognition of human behavior by space-time silhouette characterization , 2008, Pattern Recognit. Lett..