Event-Based Pedestrian Detection Using Dynamic Vision Sensors

Pedestrian detection has attracted considerable research attention in video surveillance, traffic statistics, and especially autonomous driving. To date, almost all pedestrian detection solutions rely on conventional frame-based image sensors, which suffer from limited reaction speed and high data redundancy. The dynamic vision sensor (DVS), inspired by biological retinas, captures visual information efficiently as sparse, asynchronous events rather than dense, synchronous frames. It can eliminate redundant data transmission and avoid motion blur or data leakage in high-speed imaging applications. However, it is usually impractical to feed event streams directly into conventional object detection algorithms. To address this issue, we first propose a novel event-to-frame conversion method that integrates the inherent characteristics of events more efficiently. Moreover, we design an improved feature extraction network that reuses intermediate features to further reduce the computational effort. We evaluate the proposed method on a custom dataset containing multiple real-world pedestrian scenes. The results indicate that the proposed method improves pedestrian detection accuracy by about 5.6–10.8%, and its detection is nearly 20% faster than previously reported methods. Furthermore, it achieves a processing speed of about 26 FPS and an AP of 87.43% when implemented on a single CPU, which fully meets the requirement of real-time detection.
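To make the idea of event-to-frame conversion concrete, the following is a minimal sketch of the simplest generic approach: summing event polarities per pixel over a fixed time window. It is not the conversion method proposed in this work, whose details are not given here. The event layout `(x, y, timestamp_us, polarity)` and the function name `events_to_frame` are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (x, y, timestamp_us, polarity), polarity in {+1, -1}.
Event = Tuple[int, int, int, int]


def events_to_frame(
    events: Iterable[Event],
    width: int,
    height: int,
    t_start: int,
    t_end: int,
) -> List[List[int]]:
    """Sum signed polarities of events with t_start <= t < t_end into a 2D grid."""
    frame = [[0] * width for _ in range(height)]
    for x, y, t, p in events:
        # Ignore events outside the time window or the sensor area.
        if t_start <= t < t_end and 0 <= x < width and 0 <= y < height:
            frame[y][x] += p
    return frame


events = [(0, 0, 10, 1), (0, 0, 20, 1), (1, 0, 30, -1), (1, 1, 999, 1)]
frame = events_to_frame(events, width=2, height=2, t_start=0, t_end=100)
print(frame)  # [[2, -1], [0, 0]]
```

The resulting grid can be normalized and passed to a conventional frame-based detector. The window length trades latency against how much motion each frame accumulates.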
