Active Perception With Dynamic Vision Sensors. Minimum Saccades With Optimum Recognition

Vision processing with dynamic vision sensors (DVSs) is becoming increasingly popular. This type of bio-inspired vision sensor does not record static images; instead, DVS pixel activity is driven by changes in light intensity. In this paper, we introduce a platform for object recognition with a DVS in which the sensor is mounted on a moving pan-tilt unit in a closed loop with a recognition neural network. The network is trained to recognize objects observed by the DVS while the pan-tilt unit moves to emulate micro-saccades. We show that performing more saccades in different directions yields more information about the object and therefore enables more accurate recognition. However, in high-performance, low-latency platforms, each additional saccade adds latency and power consumption. Here, we show that the number of saccades can be reduced while keeping the same recognition accuracy by performing intelligent saccadic movements in a closed action-perception loop. We propose an algorithm for smart saccadic movement decisions that halves, on average, the number of saccades needed to reach a predefined accuracy on the N-MNIST dataset. Additionally, we show that replacing this control algorithm with an artificial neural network that learns to control the saccades also halves the average number of saccades needed for N-MNIST recognition.
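
The closed action-perception loop described above can be illustrated with a minimal Python sketch. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It only shows the general idea of accumulating classifier evidence saccade by saccade and stopping as soon as a confidence threshold is met. The names `recognize_with_saccades`, `observe`, `classify`, `threshold`, and the toy data are all hypothetical and introduced purely for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array of class scores."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def recognize_with_saccades(observe, classify, directions, threshold=0.9):
    """Accumulate class evidence over saccades and stop once confident.

    observe(direction)  -> events recorded during one saccade (hypothetical)
    classify(events)    -> 1-D array of class scores (hypothetical)
    directions          -> ordered candidate saccade directions
    Returns (predicted_label, number_of_saccades_used).
    """
    if len(directions) == 0:
        raise ValueError("at least one saccade direction is required")
    evidence = None
    used = 0
    for direction in directions:
        scores = np.asarray(classify(observe(direction)), dtype=float)
        evidence = scores if evidence is None else evidence + scores
        used += 1
        # Early stop: skip the remaining saccades once confidence is high enough.
        if softmax(evidence).max() >= threshold:
            break
    return int(np.argmax(evidence)), used


# Toy stand-ins for the sensor and the recognition network (assumptions only).
TOY_SCORES = {
    "right": [1.0, 0.0, 0.0],
    "up": [3.0, 0.0, 0.0],
    "left": [1.0, 0.0, 0.0],
    "down": [1.0, 0.0, 0.0],
}


def toy_observe(direction):
    # A real system would move the pan-tilt unit and return DVS events.
    return direction


def toy_classify(events):
    # A real system would run the recognition network on the events.
    return TOY_SCORES[events]


label, used = recognize_with_saccades(
    toy_observe, toy_classify, ["right", "up", "left", "down"]
)
print(label, used)
```

In this toy setting the loop stops before all four saccades are performed, which is the effect the abstract targets. A smarter policy would also choose which direction to try next, either with a hand-written rule or with a learned controller, rather than following a fixed order as this sketch does.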
