CIFAR10-DVS: An Event-Stream Dataset for Object Classification

Neuromorphic vision research requires high-quality and appropriately challenging event-stream datasets to support continuous improvement of algorithms and methods. However, creating event-stream datasets is time-consuming, because the data must be recorded with neuromorphic cameras, and only a limited number of event-stream datasets are currently available. In this work, by utilizing the popular computer vision dataset CIFAR-10, we converted 10,000 frame-based images into 10,000 event streams using a dynamic vision sensor (DVS), providing an event-stream dataset of intermediate difficulty across 10 classes, named "CIFAR10-DVS." The conversion was implemented by a repeated closed-loop smooth (RCLS) movement of the frame-based images. Unlike conversions that move the camera over static images, moving the images themselves is more realistic with respect to practical applications. The repeated closed-loop image movement generates rich local intensity changes in continuous time, which are quantized by each pixel of the DVS to generate events. Furthermore, a performance benchmark for event-driven object classification is provided based on state-of-the-art classification algorithms. This work provides a large event-stream dataset and an initial benchmark for comparison, which may boost algorithm development in event-driven pattern recognition and object classification.
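To make the event-generation mechanism concrete, the sketch below simulates how per-pixel temporal-contrast quantization turns a closed-loop image movement into a stream of ON/OFF events. It is a minimal illustration only: the circular translation path, the contrast threshold, and all parameter values are assumptions for the sake of the example, not the actual RCLS trajectory or DVS camera settings used to record CIFAR10-DVS.

```python
import numpy as np

def rcls_event_sketch(image, steps=48, radius=2, theta=0.15):
    """Toy sketch of DVS-style event generation from a moving image.

    Translates `image` (2-D grayscale array, values in [0, 1]) along a
    small closed loop and emits an event whenever the log-intensity
    change at a pixel exceeds the contrast threshold `theta`, mimicking
    the per-pixel temporal-contrast quantization of a DVS. All values
    here are illustrative, not the CIFAR10-DVS recording parameters.
    """
    eps = 1e-3                              # avoid log(0) on dark pixels
    log_ref = np.log(image + eps)           # per-pixel reference level
    events = []                             # (t, y, x, polarity) tuples
    for t in range(1, steps + 1):
        # Closed-loop (circular) displacement, rounded to whole pixels.
        angle = 2 * np.pi * t / steps
        dy = int(round(radius * np.sin(angle)))
        dx = int(round(radius * np.cos(angle)))
        shifted = np.roll(np.roll(image, dy, axis=0), dx, axis=1)
        log_now = np.log(shifted + eps)
        diff = log_now - log_ref
        # A pixel fires an ON (+1) or OFF (-1) event when its
        # log-intensity change crosses the threshold; its reference
        # level then resets, as in a real DVS pixel.
        on = diff > theta
        off = diff < -theta
        for y, x in zip(*np.where(on | off)):
            events.append((t, y, x, 1 if on[y, x] else -1))
        log_ref[on | off] = log_now[on | off]
    return events
```

Applied to a 32 x 32 grayscale CIFAR-10 image scaled to [0, 1], e.g. `events = rcls_event_sketch(img)`, the sketch yields the expected qualitative behavior: events cluster along high-contrast edges, and the closed loop returns each pixel near its starting intensity so the stream can be repeated.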
