Paper information - Event-Based Feature Detection, Recognition and Classification

Event-Based Feature Detection, Recognition and Classification

One of the fundamental tasks underlying much of computer vision is the detection, tracking and recognition of visual features. It is an inherently difficult problem, and despite advances in computational power, pixel resolution and frame rates, even state-of-the-art methods fall far short of the robustness, reliability and energy efficiency of biological vision systems. Silicon retinas, such as the Dynamic Vision Sensor (DVS) and the Asynchronous Time-based Imaging Sensor (ATIS), attempt to replicate some of the benefits of biological retinas and provide a vastly different paradigm in which to sense and process the visual world. Tasks such as tracking and object recognition still require the identification and matching of local visual features, but the detection, extraction and recognition of features require a fundamentally different approach, and the methods commonly applied to conventional imaging are not directly applicable. This thesis explores methods to detect features in the spatio-temporal information from event-based vision sensors. The nature of features in such data is explored, and methods to determine and detect features are demonstrated. A framework for detecting, tracking, recognising and classifying features is developed and validated using real-world data and event-based variations of existing computer vision datasets and benchmarks. The results presented in this thesis demonstrate the potential and efficacy of event-based systems. This work provides an in-depth analysis of different event-based methods for object recognition and classification and introduces two feature-based methods. Two learning systems, one event-based and the other iterative, were used to explore the nature and classification ability of these methods. The results demonstrate the viability of event-based classification and the importance and role of motion in event-based feature detection.

Gregory Cohen
