Mixed Frame-/Event-Driven Fast Pedestrian Detection

Pedestrian detection has attracted enormous research attention in the field of Intelligent Transportation System (ITS) due to that pedestrians are the most vulnerable traffic participants. So far, almost all pedestrian detection solutions are based on the conventional frame-based camera. However, they cannot perform very well in scenarios with bad light condition and high-speed motion. In this work, a Dynamic and Active Pixel Sensor (DAVIS), whose two channels concurrently output conventional gray-scale frames and asynchronous low-latency temporal contrast events of light intensity, was first used to detect pedestrians in a traffic monitoring scenario. Data from two camera channels were fed into Convolutional Neural Networks (CNNs) including three YOLOv3 models and three YOLO-tiny models to gather bounding boxes of pedestrians with respective confidence map. Furthermore, a confidence map fusion method combining the CNN-based detection results from both DAVIS channels was proposed to obtain higher accuracy. The experiments were conducted on a custom dataset collected on TUM campus. Benefiting from the high speed, low latency and wide dynamic range of the event channel, our method achieved higher frame rate and lower latency than those only using a conventional camera. Additionally, it reached higher average precision by using the fusion approach.

[1]  Nicholas F. Y. Chen Pseudo-Labels for Supervised Learning on Dynamic Vision Sensor Data, Applied to Object Detection Under Ego-Motion , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[2]  Armin B. Cremers,et al.  Informed Haar-Like Features Improve Pedestrian Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3]  Tobi Delbrück,et al.  A 128$\times$ 128 120 dB 15 $\mu$s Latency Asynchronous Temporal Contrast Vision Sensor , 2008, IEEE Journal of Solid-State Circuits.

[4]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Anton van den Hengel,et al.  Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features , 2014, ECCV.

[6]  Bernt Schiele,et al.  Filtered channel features for pedestrian detection , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Abdesselam Bouzerdoum,et al.  Pedestrian sensing using time-of-flight range camera , 2011, CVPR 2011 WORKSHOPS.

[8]  Li-Chen Fu,et al.  Pedestrian detection using histograms of Oriented Gradients of granule feature , 2013, 2013 IEEE Intelligent Vehicles Symposium (IV).

[9]  Narciso García,et al.  Event-Based Vision Meets Deep Learning on Steering Prediction for Self-Driving Cars , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[11]  B. Schiele,et al.  How Far are We from Solving Pedestrian Detection? , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Joshua Powell Pedestrian Detection with Convolutional Neural Networks , 2017 .

[13]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[14]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Shuicheng Yan,et al.  Discriminative local binary patterns for human detection in personal album , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  P. Peixoto,et al.  VISION-BASED PEDESTRIAN DETECTION USING HAAR-LIKE FEATURES , 2007 .

[17]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[18]  Pietro Perona,et al.  Fast Feature Pyramids for Object Detection , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[19]  Imen Jegham,et al.  Pedestrian Detection in Poor Weather Conditions Using Moving Camera , 2017, 2017 IEEE/ACS 14th International Conference on Computer Systems and Applications (AICCSA).

[20]  Tobi Delbruck,et al.  A Sensitive Dynamic and Active Pixel Vision Sensor for Color or Neural Imaging Applications , 2017, IEEE Transactions on Biomedical Circuits and Systems.

[21]  Damien Querlioz,et al.  Extraction of temporally correlated features from dynamic vision sensors with spike-timing-dependent plasticity , 2012, Neural Networks.

[22]  Alois Knoll,et al.  Online Multi-object Tracking-by-Clustering for Intelligent Transportation System with Neuromorphic Vision Sensor , 2017, KI.

[23]  Gianpaolo Francesco Trotta,et al.  Computer vision and deep learning techniques for pedestrian detection and tracking: A survey , 2018, Neurocomputing.

[24]  Yanwen Chong,et al.  Pedestrian detection based on gradient and texture feature integration , 2017, Neurocomputing.

[25]  Pietro Perona,et al.  Integral Channel Features , 2009, BMVC.

[26]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  T. Delbruck,et al.  > Replace This Line with Your Paper Identification Number (double-click Here to Edit) < 1 , 2022 .

[28]  Jitendra Malik,et al.  Region-Based Convolutional Networks for Accurate Object Detection and Segmentation , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Tobi Delbrück,et al.  Combined frame- and event-based detection and tracking , 2016, 2016 IEEE International Symposium on Circuits and Systems (ISCAS).

[30]  Tobi Delbruck,et al.  A 240 × 180 130 dB 3 µs Latency Global Shutter Spatiotemporal Vision Sensor , 2014, IEEE Journal of Solid-State Circuits.