Action Recognition Using a Bio-Inspired Feedforward Spiking Network

We propose a bio-inspired feedforward spiking network that models two brain areas dedicated to motion processing (V1 and MT), and we show how its spiking output can be exploited in a computer vision application: action recognition. To analyze the spike trains, we consider two characteristics of the neural code: the mean firing rate of each neuron and the synchrony between neurons. We show that both carry information relevant to action recognition. We compare our results to those of Jhuang et al. (Proceedings of the 11th international conference on computer vision, pp. 1–8, 2007) on the Weizmann database. In conclusion, we are convinced that spiking networks represent a powerful alternative framework for real vision applications, one that will benefit from recent advances in computational neuroscience.
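
The two spike-train characteristics named above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how synchrony is measured, so the sketch assumes a simple stand-in (the Pearson correlation of binned spike counts), and the function names, the bin width, and the toy spike times are all hypothetical.

```python
import math
from itertools import combinations


def mean_firing_rate(spike_times, duration):
    """Mean firing rate in spikes per second over a window of `duration` seconds."""
    return len(spike_times) / duration


def binned_counts(spike_times, duration, bin_width):
    """Count spikes in consecutive bins of `bin_width` seconds."""
    n_bins = max(1, math.ceil(duration / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        # Clamp so a spike exactly at t == duration falls in the last bin.
        idx = min(int(t / bin_width), n_bins - 1)
        counts[idx] += 1
    return counts


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def pairwise_synchrony(trains, duration, bin_width):
    """Assumed synchrony proxy: correlation of binned counts for every neuron pair."""
    binned = [binned_counts(t, duration, bin_width) for t in trains]
    return {
        (i, j): pearson(binned[i], binned[j])
        for i, j in combinations(range(len(trains)), 2)
    }


# Toy data (hypothetical): spike times in seconds for three neurons over 1 s.
trains = [
    [0.05, 0.25, 0.45],
    [0.05, 0.25, 0.45],  # fires together with neuron 0
    [0.15, 0.35],        # fires in between
]
rates = [mean_firing_rate(t, duration=1.0) for t in trains]
sync = pairwise_synchrony(trains, duration=1.0, bin_width=0.1)
```

In a recognition setting, a vector of per-neuron rates or a matrix of pairwise synchrony values of this kind could serve as the feature describing a video sequence; which of the two is more discriminative is an empirical question the sketch does not address.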

[1]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[2]  Martin A. Giese,et al.  Learning Features of Intermediate Complexity for the Recognition of Biological Motion , 2005, ICANN.

[3]  Eero P. Simoncelli,et al.  A model of neuronal responses in visual area MT , 1998, Vision Research.

[4]  Flemming Topsøe,et al.  Some inequalities for information divergence and related measures of discrimination , 2000, IEEE Trans. Inf. Theory.

[5]  Peter Dayan,et al.  Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems , 2001 .

[6]  S. Thorpe,et al.  Surfing a spike wave down the ventral stream , 2002, Vision Research.

[7]  K. Rohr Towards model-based recognition of human movements in image sequences , 1994 .

[8]  Bart G Borghuis,et al.  Temporal dynamics of direction tuning in motion-sensitive macaque area MT. , 2005, Journal of neurophysiology.

[9]  Li Fei-Fei,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2008 .

[10]  David G. Lowe,et al.  Multiclass Object Recognition with Sparse, Localized Features , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[11]  Wulfram Gerstner,et al.  Spiking Neuron Models: An Introduction , 2002 .

[12]  J Gautrais,et al.  Rate coding versus temporal order coding: a theoretical approach. , 1998, Bio Systems.

[13]  Pierre Kornprobst,et al.  Action Recognition with a Bio-inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions , 2008, ECCV.

[14]  R. Born Center-surround interactions in the middle temporal visual area of the owl monkey. , 2000, Journal of neurophysiology.

[15]  Victor A. F. Lamme,et al.  Synchrony and covariation of firing rates in the primary visual cortex during contour grouping , 2004, Nature Neuroscience.

[16]  Pierre Kornprobst,et al.  Virtual Retina: A biological retina model and simulator, with contrast gain control , 2009, Journal of Computational Neuroscience.

[17]  Tomaso Poggio,et al.  Learning a dictionary of shape-components in visual cortex: comparison with neurons, humans and machines , 2006 .

[18]  R A Andersen,et al.  The response of area MT and V1 neurons to transparent motion , 1991, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[19]  David C. Hogg Model-based vision: a program to see a walking person , 1983, Image Vis. Comput..

[20]  Steven M. Seitz,et al.  View-Invariant Analysis of Cyclic Motion , 1997, International Journal of Computer Vision.

[21]  Eric Hiris,et al.  Temporal properties in masking biological motion , 2005, Perception & psychophysics.

[22]  Nicolas Pinto,et al.  Why is Real-World Visual Object Recognition Hard? , 2008, PLoS Comput. Biol..

[23]  Maurice Milgram,et al.  Recognition of human behavior by space-time silhouette characterization , 2008, Pattern Recognit. Lett..

[24]  Fiona E. N. LeBeau,et al.  Single-column thalamocortical network model exhibiting gamma oscillations, sleep spindles, and epileptogenic bursts. , 2005, Journal of neurophysiology.

[25]  Thomas Serre,et al.  Object recognition with features inspired by visual cortex , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[26]  Antonio Politi,et al.  Measuring spike train synchrony , 2007, Journal of Neuroscience Methods.

[27]  D. W. Wheeler,et al.  Brightness Induction: Rate Enhancement and Neuronal Synchronization as Complementary Codes , 2006, Neuron.

[28]  Tae-Kyun Kim,et al.  Learning Motion Categories using both Semantic and Structural Information , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Lihi Zelnik-Manor,et al.  Event-based analysis of video , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[30]  Ronen Basri,et al.  Actions as Space-Time Shapes , 2007, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Guillaume S Masson,et al.  Spatial scale of motion segmentation from speed cues , 2001, Vision Research.

[32]  Tim Gollisch,et al.  Rapid Neural Coding in the Retina with Relative Spike Latencies , 2008, Science.

[33]  Pietro Perona,et al.  Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[34]  Martin A. Giese,et al.  Roles of Motion and Form in Biological Motion Recognition , 2003, ICANN.

[35]  Bruno Cessac,et al.  To which extend is the "neural code" a metric ? , 2008, ArXiv.

[36]  Leo L. Lui,et al.  Spatial summation, end inhibition and side inhibition in the middle temporal visual area (MT). , 2007, Journal of neurophysiology.

[37]  D. Bradley,et al.  Structure and function of visual area MT. , 2005, Annual review of neuroscience.

[38]  Antonino Casile,et al.  Critical features for the recognition of biological motion. , 2005, Journal of vision.

[39]  Barbara Caputo,et al.  Local velocity-adapted motion events for spatio-temporal recognition , 2007, Comput. Vis. Image Underst..

[40]  T. Albright,et al.  Contribution of area MT to perception of three-dimensional shape: a computational study , 1996, Vision Research.

[41]  Russell L. De Valois,et al.  PII: S0042-6989(00)00210-8 , 2000 .

[42]  Alan B Saul,et al.  Temporal properties of inputs to direction-selective neurons in monkey V1. , 2005, Journal of neurophysiology.

[43]  W. Singer,et al.  Synchronization of neuronal responses in primary visual cortex of monkeys viewing natural images. , 2008, Journal of neurophysiology.

[44]  E H Adelson,et al.  Spatiotemporal energy models for the perception of motion. , 1985, Journal of the Optical Society of America. A, Optics and image science.

[45]  Alan L. Yuille,et al.  A Model for the Estimate of Local Velocity , 1990, ECCV.

[46]  Simon J. Thorpe,et al.  Ultra-Rapid Scene Categorization with a Wave of Spikes , 2002, Biologically Motivated Computer Vision.

[47]  John K. Tsotsos,et al.  Attending to visual motion , 2005, Comput. Vis. Image Underst..

[48]  J A Beintema,et al.  Perception of biological motion without local image motion , 2002, Proceedings of the National Academy of Sciences of the United States of America.

[49]  Jitendra Malik,et al.  Recognizing action at a distance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[50]  P. Goldman-Rakic,et al.  Preface: Cerebral Cortex Has Come of Age , 1991 .

[51]  Mubarak Shah,et al.  Motion-Based Recognition , 1997, Computational Imaging and Vision.

[52]  W. Singer,et al.  Synchronous oscillations in the cat retina , 1999, Vision Research.

[53]  DeLiang Wang,et al.  Locally excitatory globally inhibitory oscillator networks , 1995, IEEE Transactions on Neural Networks.

[54]  Serge J. Belongie,et al.  Behavior recognition via sparse spatio-temporal features , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[55]  G. Orban,et al.  The spatial distribution of the antagonistic surround of MT/V5 neurons. , 1997, Cerebral cortex.

[56]  Robert T. Collins,et al.  Silhouette-based human identification from body shape and gait , 2002, Proceedings of Fifth IEEE International Conference on Automatic Face Gesture Recognition.

[57]  T. Poggio,et al.  Cognitive neuroscience: Neural mechanisms for the recognition of biological movements , 2003, Nature Reviews Neuroscience.

[58]  William Bialek,et al.  Spikes: Exploring the Neural Code , 1996 .

[59]  Heiko Neumann,et al.  Disambiguating Visual Motion by Form-Motion Interaction—a Computational Model , 2007, International Journal of Computer Vision.

[60]  A. Destexhe,et al.  The high-conductance state of neocortical neurons in vivo , 2003, Nature Reviews Neuroscience.

[61]  A. Aertsen,et al.  Spike synchronization and rate modulation differentially involved in motor cortical function. , 1997, Science.

[62]  Simon J. Thorpe,et al.  Spike arrival times: A highly efficient coding scheme for neural networks , 1990 .

[63]  W. Singer,et al.  Rapid feature selective neuronal synchronization through correlated latency shifting , 2001, Nature Neuroscience.

[64]  Seong-Whan Lee,et al.  Biologically Motivated Computer Vision , 2002, Lecture Notes in Computer Science.

[65]  Eero P. Simoncelli,et al.  How MT cells analyze the motion of visual patterns , 2006, Nature Neuroscience.

[66]  T. Sejnowski,et al.  A selection model for motion processing in area MT of primates , 1995, The Journal of neuroscience : the official journal of the Society for Neuroscience.

[67]  J. Robson Spatial and Temporal Contrast-Sensitivity Functions of the Visual System , 1966 .

[68]  Andrew B. Watson,et al.  A look at motion in the frequency domain , 1983 .

[69]  D. J. Felleman,et al.  Distributed hierarchical processing in the primate cerebral cortex. , 1991, Cerebral cortex.

[70]  Wulfram Gerstner,et al.  Spiking Neuron Models , 2002 .

[71]  R. Blake,et al.  Perception of human motion. , 2007, Annual review of psychology.

[72]  D. Hubel,et al.  Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of Physiology.

[73]  J. Victor,et al.  Nature and precision of temporal coding in visual cortex: a metric-space analysis. , 1996, Journal of neurophysiology.

[74]  M.-J. Escobar,et al.  Biological Motion Recognition Using a MT-like Model , 2006, 2006 IEEE 3rd Latin American Robotics Symposium.

[75]  Jean Bullier,et al.  The Timing of Information Transfer in the Visual System , 1997 .

[76]  Liang Wang,et al.  Recognizing Human Activities from Silhouettes: Motion Subspace and Factorial Discriminative Graphical Model , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[77]  Christopher C. Pack,et al.  Contrast dependence of suppressive influences in cortical area MT of alert macaque. , 2005, Journal of neurophysiology.

[78]  Michael Shelley,et al.  How Simple Cells Are Made in a Nonlinear Network Model of the Visual Cortex , 2001, The Journal of Neuroscience.

[79]  Randal C. Nelson,et al.  Detection and Recognition of Periodic, Nonrigid Motion , 1997, International Journal of Computer Vision.

[80]  J. Movshon,et al.  Dynamics of motion signaling by neurons in macaque area MT , 2005, Nature Neuroscience.

[81]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[82]  Eugene M. Izhikevich,et al.  Which model to use for cortical spiking neurons? , 2004, IEEE Transactions on Neural Networks.

[83]  T. Sejnowski,et al.  Discovering Spike Patterns in Neuronal Responses , 2004, The Journal of Neuroscience.

[84]  Juan Carlos Niebles,et al.  Unsupervised Learning of Human Action Categories Using Spatial-Temporal Words , 2006, BMVC.

[85]  S. Thorpe,et al.  Speed of processing in the human visual system , 1996, Nature.

[86]  Thomas Serre,et al.  A Biologically Inspired System for Action Recognition , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[87]  S. Grossberg,et al.  Laminar cortical dynamics of visual form and motion interactions during coherent object motion perception. , 2007, Spatial vision.

[88]  Margaret E. Sereno,et al.  2-D center-surround effects on 3-D structure-from-motion. , 1999, Journal of experimental psychology. Human perception and performance.

[89]  Bevil R. Conway,et al.  Space-time maps and two-bar interactions of different classes of direction-selective cells in macaque V-1. , 2003, Journal of neurophysiology.

[90]  James W. Davis,et al.  The Recognition of Human Movement Using Temporal Templates , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[91]  M. Lappe,et al.  Visual areas involved in the perception of human movement from dynamic form analysis , 2005, Neuroreport.

[92]  G. Orban,et al.  Spatial heterogeneity of inhibitory surrounds in the middle temporal visual area. , 1995, Proceedings of the National Academy of Sciences of the United States of America.