ES-ImageNet: A Million Event-Stream Classification Dataset for Spiking Neural Networks

As event-driven algorithms, especially spiking neural networks (SNNs), continue to improve in neuromorphic vision processing, a more challenging event-stream (ES) dataset is urgently needed. However, creating an ES-dataset with neuromorphic cameras such as dynamic vision sensors (DVS) is well known to be time-consuming and costly. In this work, we propose a fast and effective algorithm termed Omnidirectional Discrete Gradient (ODG) to convert the popular computer vision dataset ILSVRC2012 into its ES version, turning about 1,300,000 frame-based images into ES-samples in 1,000 categories. The resulting ES-dataset, called ES-ImageNet, is dozens of times larger than other current neuromorphic classification datasets and is generated entirely in software. The ODG algorithm simulates image motion to generate local value changes that carry discrete gradient information in different directions, providing a low-cost, high-speed method for converting frame-based images into event streams; a companion method, Edge-Integral, reconstructs high-quality images from the event streams. Furthermore, we analyze the statistics of ES-ImageNet in multiple ways and provide a performance benchmark of the dataset using both well-known deep neural network algorithms and spiking neural network algorithms. We believe this work provides a new large-scale benchmark dataset for SNNs and neuromorphic vision.
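
The idea of generating events from simulated image motion can be illustrated with a minimal sketch. This is not the ODG algorithm itself, whose details are not given here; it only assumes that the image is translated by small integer offsets and that a pixel emits an event when its value changes by more than a threshold between consecutive positions. The function name `image_to_events`, the `offsets` path, the `threshold` value, and the wrap-around border handling are all hypothetical choices made for illustration.

```python
import numpy as np

def image_to_events(image, offsets, threshold=0.1):
    """Turn a 2-D grayscale image into (t, x, y, polarity) events.

    Illustrative only: the image is shifted by each (dx, dy) offset in turn,
    and an event is emitted wherever the value changes by more than
    `threshold` relative to the previous position.
    """
    image = np.asarray(image, dtype=np.float64)
    events = []
    prev = image  # frame at the starting position
    for t, (dx, dy) in enumerate(offsets):
        # np.roll wraps around at the border (an assumption made for brevity).
        moved = np.roll(image, shift=(dy, dx), axis=(0, 1))
        diff = moved - prev  # discrete change along the motion direction
        ys, xs = np.nonzero(np.abs(diff) > threshold)
        for x, y in zip(xs, ys):
            polarity = 1 if diff[y, x] > 0 else -1
            events.append((t, int(x), int(y), polarity))
        prev = moved
    return events

# A single bright pixel moved one step to the right yields one OFF event
# at its old location and one ON event at its new location.
demo = np.zeros((4, 4))
demo[1, 1] = 1.0
demo_events = image_to_events(demo, offsets=[(1, 0)])
```

Moving the image along several directions in sequence, for example `[(1, 0), (0, 1), (-1, 0), (0, -1)]`, exposes intensity changes along each of those directions, which is the intuition behind using motion to encode directional gradient information.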
