Thermal imaging pedestrian detection algorithm based on attention guidance and local cross-level network

Abstract. Pedestrian-related accidents are more frequent at night, when visible-light (VI) cameras perform poorly. Thermal cameras work better than VI cameras in these conditions. However, thermal images have several drawbacks: high noise, low resolution, little detail, and sensitivity to ambient temperature. To overcome these shortcomings, an improved algorithm based on You Only Look Once version 3 (YOLOv3) is proposed. First, the number and size of the anchors are obtained using k-means++, so that the anchor shapes better match the targets. Second, an attention module is added to the backbone network, which helps extract better feature maps from low-quality thermal images. Finally, an improved atrous spatial pyramid pooling module is appended to the backbone network so that the extracted feature maps carry more multi-scale and context information. Experiments on the Computer Vision Center-09 (CVC-09) dataset show an average precision of 86.1%, which is 3.5% higher than YOLOv3 and 0.8% higher than YOLOv4, at a detection speed of 48 FPS. The results show that the improved algorithm has good accuracy and generalization.
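
The anchor-selection step can be illustrated with a minimal sketch. The abstract states only that k-means++ is used to choose the number and size of anchors. The 1 - IoU distance (the convention introduced with YOLO9000), the squared-distance seeding weights, the synthetic box data, and the names `iou_wh` and `kmeanspp_anchors` are assumptions made here for illustration, not details taken from the paper.

```python
import numpy as np


def iou_wh(boxes, centers):
    """IoU between (N, 2) boxes and (K, 2) centers given as (width, height).

    All boxes are treated as sharing the same top-left corner, so only
    their shapes are compared.
    """
    inter = (np.minimum(boxes[:, None, 0], centers[None, :, 0])
             * np.minimum(boxes[:, None, 1], centers[None, :, 1]))
    area_b = boxes[:, 0] * boxes[:, 1]
    area_c = centers[:, 0] * centers[:, 1]
    return inter / (area_b[:, None] + area_c[None, :] - inter)


def kmeanspp_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors (hypothetical helper)."""
    rng = np.random.default_rng(seed)

    # k-means++ seeding: the first center is uniform at random; each later
    # center is drawn with probability proportional to the squared distance
    # to the nearest center chosen so far. Distance here is 1 - IoU (assumed).
    centers = [boxes[rng.integers(len(boxes))]]
    while len(centers) < k:
        d = (1.0 - iou_wh(boxes, np.array(centers))).min(axis=1)
        w = d ** 2
        probs = w / w.sum() if w.sum() > 0 else np.full(len(boxes), 1.0 / len(boxes))
        centers.append(boxes[rng.choice(len(boxes), p=probs)])
    centers = np.array(centers, dtype=float)

    # Standard Lloyd iterations: assign each box to the center with the
    # highest IoU, then move each center to the mean of its cluster.
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)
        new_centers = np.array([
            boxes[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    # Return anchors sorted by area, smallest first.
    return centers[np.argsort(centers.prod(axis=1))]


# Synthetic (width, height) data: three groups of tall, pedestrian-like boxes.
data_rng = np.random.default_rng(42)
boxes = np.vstack([
    data_rng.normal(loc=(10, 30), scale=1.0, size=(50, 2)),
    data_rng.normal(loc=(20, 60), scale=1.0, size=(50, 2)),
    data_rng.normal(loc=(40, 120), scale=1.0, size=(50, 2)),
])
anchors = kmeanspp_anchors(boxes, k=3)
mean_best_iou = iou_wh(boxes, anchors).max(axis=1).mean()
print(anchors)
print(f"mean best IoU: {mean_best_iou:.3f}")
```

In practice the boxes would come from the ground-truth annotations of the training set, and the mean best IoU could be compared across several values of `k` to pick the number of anchors.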
