Spatial and Temporal Downsampling in Event-Based Visual Classification

As interest in event-based vision sensors for mobile and aerial applications grows, there is an increasing need for high-speed and robust algorithms that perform visual tasks on event-based data. Since the event rate and the network structure have a direct impact on the power consumed by such systems, it is important to explore the efficiency of the event-based encoding used by these sensors. The work presented in this paper is the first study focused solely on the effects of both spatial and temporal downsampling on event-based vision data, and it makes use of a variety of data sets chosen to fully explore and characterize the nature of downsampling operations. The results show that both spatial and temporal downsampling improve classification accuracy while also lowering the overall data rate, a finding that is particularly relevant for bandwidth- and power-constrained systems. For a given network containing 1000 hidden-layer neurons, the spatially downsampled systems achieved a best-case accuracy of 89.38% on N-MNIST, as opposed to 81.03% with no downsampling at the same hidden-layer size. On the N-Caltech101 data set, the downsampled system achieved a best-case accuracy of 18.25%, compared with 7.43% with no downsampling. These results show that downsampling is an important preprocessing technique in event-based visual processing, especially for applications sensitive to power consumption and transmission bandwidth.
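
To make the downsampling operation concrete, the sketch below shows one straightforward way to apply it to an event stream, assuming events arrive as parallel NumPy arrays of pixel addresses (x, y), microsecond timestamps t, and polarities p. The function name, the default factors, and the policy of merging events that collapse onto the same cell are illustrative assumptions, not the paper's exact implementation.

    import numpy as np

    def downsample_events(x, y, t, p, spatial_factor=2, temporal_bin_us=1000):
        # Spatial downsampling: integer-divide pixel addresses so that, e.g.,
        # the 34x34 N-MNIST grid becomes 17x17 for spatial_factor=2.
        xs = x // spatial_factor
        ys = y // spatial_factor
        # Temporal downsampling: quantize timestamps into fixed-width bins,
        # discarding sub-bin timing resolution (bin width is an assumption).
        ts = (t // temporal_bin_us) * temporal_bin_us
        # Events that land on the same (x, y, t, polarity) cell become
        # duplicates; dropping them is what lowers the overall event rate.
        ev = np.stack([xs, ys, ts, p], axis=1)
        _, first = np.unique(ev, axis=0, return_index=True)
        return ev[np.sort(first)]  # keep first occurrences in stream order

    # Example: three events, two of which merge after downsampling.
    x = np.array([3, 4, 3]); y = np.array([10, 10, 10])
    t = np.array([100, 350, 900]); p = np.array([1, 1, 1])
    print(downsample_events(x, y, t, p))  # two events remain

Whether collided events are dropped, counted, or retained is a design choice, and the accuracy and data-rate trade-offs reported above depend on the downsampling factors chosen.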
