Speed Accuracy Trade-off in Pedestrian and Vehicle Detection Using Localized Big Data

This paper provides a guide to several experimental object detection architectures for pedestrian detection and vehicle detection (PD/VD) in the South Korean driving environment, and presents the trade-off between mean average precision (mAP) and frames per second (FPS) for each. To this end, we built a Korean vehicle black box front view dataset (KVD) that reflects actual driving conditions in South Korea. We then evaluated various configurations of object detection architectures on the KVD, and summarized and analyzed the trade-off for each architecture. The result is a guide for choosing PD/VD architectures that achieve a suitable mAP–FPS balance for advanced driver assistance systems (ADAS) and autonomous navigation technologies.
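One way to read an mAP–FPS trade-off is as a Pareto front: a configuration is worth considering only if no other configuration is at least as good on both metrics and strictly better on one. The Python sketch below illustrates that reading under stated assumptions. The `Result` type, the `pareto_front` and `pick` helpers, the detector names, and all numbers are hypothetical placeholders for illustration; they are not the paper's method or measurements.

```python
from typing import NamedTuple, Optional


class Result(NamedTuple):
    name: str
    map_score: float  # mean average precision, in [0, 1]
    fps: float        # frames per second


def pareto_front(results: list[Result]) -> list[Result]:
    """Keep results that no other result beats on both mAP and FPS."""
    front = []
    for r in results:
        dominated = any(
            o.map_score >= r.map_score and o.fps >= r.fps
            and (o.map_score > r.map_score or o.fps > r.fps)
            for o in results
        )
        if not dominated:
            front.append(r)
    # Sort slowest to fastest so the trade-off reads left to right.
    return sorted(front, key=lambda r: r.fps)


def pick(results: list[Result], min_fps: float) -> Optional[Result]:
    """Most accurate Pareto-optimal result that meets the FPS floor."""
    ok = [r for r in pareto_front(results) if r.fps >= min_fps]
    return max(ok, key=lambda r: r.map_score) if ok else None


# Placeholder numbers (assumed), NOT measurements from the paper.
demo = [
    Result("detector_a", 0.80, 5.0),
    Result("detector_b", 0.70, 20.0),
    Result("detector_c", 0.65, 15.0),  # slower and less accurate than detector_b
    Result("detector_d", 0.55, 45.0),
]

front_names = [r.name for r in pareto_front(demo)]
choice = pick(demo, min_fps=10.0)
```

With these placeholder values, `detector_c` drops out because `detector_b` is both faster and more accurate, and a 10 FPS floor selects `detector_b`. In an ADAS setting the floor would come from the application's latency budget, which is an assumption outside this sketch.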
