Low-Power Dynamic Object Detection and Classification With Freely Moving Event Cameras

We present the first purely event-based, energy-efficient approach for dynamic object detection and categorization with a freely moving event camera. Compared to traditional cameras, event-based object recognition systems are considerably behind in terms of accuracy and algorithmic maturity. To this end, this paper presents an event-based feature extraction method devised by accumulating local activity across the image frame and then applying principal component analysis (PCA) to the normalized neighborhood region. Subsequently, we propose a backtracking-free k-d tree mechanism for efficient feature matching by taking advantage of the low-dimensionality of the feature representation. Additionally, the proposed k-d tree mechanism allows for feature selection to obtain a lower-dimensional object representation when hardware resources are limited to implement PCA. Consequently, the proposed system can be realized on a field-programmable gate array (FPGA) device leading to high performance over resource ratio. The proposed system is tested on real-world event-based datasets for object categorization, showing superior classification performance compared to state-of-the-art algorithms. Additionally, we verified the real-time FPGA performance of the proposed object detection method, trained with limited data as opposed to deep learning methods, under a closed-loop aerial vehicle flight mode. We also compare the proposed object categorization framework to pre-trained convolutional neural networks using transfer learning and highlight the drawbacks of using frame-based sensors under dynamic camera motion. Finally, we provide critical insights about the feature extraction method and the classification parameters on the system performance, which aids in understanding the framework to suit various low-power (less than a few watts) application scenarios.

[1]  David G. Lowe,et al.  Shape indexing using approximate nearest-neighbour search in high-dimensional spaces , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[2]  Liang Chen,et al.  Scalable scene understanding via saliency consensus , 2019, Soft Comput..

[3]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[4]  Richard I. Hartley,et al.  Optimised KD-trees for fast image descriptor matching , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Cordelia Schmid,et al.  Aggregating local descriptors into a compact image representation , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[6]  Continuous Time–Intensity , 2017 .

[7]  Tobi Delbruck,et al.  Real-time classification and sensor fusion with a spiking deep belief network , 2013, Front. Neurosci..

[8]  Shih-Chii Liu,et al.  Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences , 2016, NIPS.

[9]  Tobi Delbrück,et al.  Retinomorphic Event-Based Vision Sensors: Bioinspired Cameras With Spiking Output , 2014, Proceedings of the IEEE.

[10]  Nick Barnes,et al.  Continuous-time Intensity Estimation Using Event Cameras , 2018, ACCV.

[11]  Tobi Delbruck,et al.  Low Latency Event-Based Filtering and Feature Extraction for Dynamic Vision Sensors in Real-Time FPGA Applications , 2019, IEEE Access.

[12]  Garrick Orchard,et al.  Spike context: A neuromorphic descriptor for pattern recognition , 2017, 2017 IEEE Biomedical Circuits and Systems Conference (BioCAS).

[13]  Hong Yang,et al.  DART: Distribution Aware Retinal Transform for Event-Based Cameras , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Loukas P. Petrou,et al.  Expanding a robot's life: Low power object recognition via FPGA-based DCNN deployment , 2018, 2018 7th International Conference on Modern Circuits and Systems Technologies (MOCAST).

[15]  Frédéric Jurie,et al.  Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.

[16]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[17]  Tobi Delbruck,et al.  Robotic goalie with 3 ms reaction time at 4% CPU load using event-based dynamic vision sensor , 2013, Front. Neurosci..

[18]  Xiaojun Zhai,et al.  Automatic Number Plate Recognition on FPGA , 2013, 2013 IEEE 20th International Conference on Electronics, Circuits, and Systems (ICECS).

[19]  Sunil Arya,et al.  Algorithms for fast vector quantization , 1993, [Proceedings] DCC `93: Data Compression Conference.

[20]  Garrick Orchard,et al.  A Noise Filtering Algorithm for Event-Based Asynchronous Change Detection Image Sensors on TrueNorth and Its Implementation on TrueNorth , 2018, Front. Neurosci..

[21]  Usman Ali,et al.  FPGA/soft-processor based real-time object tracking system , 2009, 2009 5th Southern Conference on Programmable Logic (SPL).

[22]  Tobi Delbrück,et al.  Combined frame- and event-based detection and tracking , 2016, 2016 IEEE International Symposium on Circuits and Systems (ISCAS).

[23]  David G. Lowe,et al.  Fast Approximate Nearest Neighbors with Automatic Algorithm Configuration , 2009, VISAPP.

[24]  Tobi Delbrück,et al.  Retinal ganglion cell software and FPGA model implementation for object detection and tracking , 2016, 2016 IEEE International Symposium on Circuits and Systems (ISCAS).

[25]  Ryad Benosman,et al.  Asynchronous Event-Based Visual Shape Tracking for Stable Haptic Feedback in Microrobotics , 2012, IEEE Transactions on Robotics.

[26]  Lina J. Karam,et al.  Understanding how image quality affects deep neural networks , 2016, 2016 Eighth International Conference on Quality of Multimedia Experience (QoMEX).

[27]  Hiroomi Hikawa,et al.  Novel FPGA Implementation of Hand Sign Recognition System With SOM–Hebb Classifier , 2015, IEEE Transactions on Circuits and Systems for Video Technology.

[28]  Nitish V. Thakor,et al.  HFirst: A Temporal Approach to Object Recognition , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Vincent Lepetit,et al.  Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Tobi Delbrück,et al.  An embedded AER dynamic vision sensor for low-latency pole balancing , 2009, 2009 IEEE 12th International Conference on Computer Vision Workshops, ICCV Workshops.

[31]  Hong Yang,et al.  PCA-RECT: An Energy-efficient Object Detection Approach for Event Cameras , 2018, ACCV Workshops.

[32]  Ryad Benosman,et al.  HATS: Histograms of Averaged Time Surfaces for Robust Event-Based Object Classification , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Jason Schlessman,et al.  Hardware/Software Co-Design of an FPGA-based Embedded Tracking System , 2006, 2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06).

[34]  Toon Goedemé,et al.  On-board real-time tracking of pedestrians on a UAV , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[35]  Cordelia Schmid,et al.  Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[36]  Marko Tscherepanow,et al.  A saliency map based on sampling an image into random rectangular regions of interest , 2012, Pattern Recognit..

[37]  Ryad Benosman,et al.  Event-based Dynamic Face Detection and Tracking Based on Activity , 2018, ArXiv.

[38]  Garrick Orchard,et al.  HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[39]  Davide Scaramuzza,et al.  Low-latency visual odometry using event-based feature tracks , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[40]  Serge J. Belongie,et al.  Object categorization using co-occurrence, location and appearance , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Tobi Delbrück,et al.  Training Deep Spiking Neural Networks Using Backpropagation , 2016, Front. Neurosci..

[42]  Chiara Bartolozzi,et al.  Towards Event-Driven Object Detection with Off-the-Shelf Deep Learning , 2018, 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[43]  Tobi Delbruck,et al.  A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor , 2014, IEEE Journal of Solid-State Circuits.

[44]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[45]  Gregory Cohen,et al.  Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades , 2015, Front. Neurosci..