Benchmarking neuromorphic vision: lessons learnt from computer vision

Neuromorphic Vision sensors have improved greatly since the first silicon retina was presented almost three decades ago. They have recently matured to the point where they are commercially available and can be operated by non-experts. Despite this improved availability, however, good datasets remain scarce, and algorithms for processing spike-based visual data are still in their infancy. Frame-based computer vision algorithms, by contrast, are far more mature, thanks in part to widely accepted datasets that allow direct comparison between algorithms and encourage competition. This presents a unique opportunity to shape the development of Neuromorphic Vision benchmarks and challenges by leveraging what has been learnt from the use of datasets in frame-based computer vision. Taking advantage of this opportunity, in this paper we review the role that benchmarks and challenges have played in the advancement of frame-based computer vision, and we suggest guidelines for the creation of Neuromorphic Vision benchmarks and challenges. We also discuss the unique difficulties faced when benchmarking Neuromorphic Vision algorithms, particularly when attempting a direct comparison with frame-based computer vision.
