Pedestrian Detection at Night in Infrared Images Using an Attention-Guided Encoder-Decoder Convolutional Neural Network

Pedestrian-related accidents are much more likely to occur at night, when visible (VI) cameras are much less effective. Unlike VI cameras, infrared (IR) cameras can work in total darkness. However, IR images have several drawbacks, such as low resolution, noise, and thermal energy characteristics that vary with the weather. To overcome these drawbacks, we propose an IR camera system that identifies pedestrians at night using a novel attention-guided encoder-decoder convolutional neural network (AED-CNN). In AED-CNN, encoder-decoder modules generate multi-scale features, and new skip connection blocks in the decoder combine the feature maps from the encoder and decoder modules. This architecture increases the available contextual information, which helps to extract discriminative features from low-resolution, noisy IR images. Furthermore, we propose an attention module that re-weights the multi-scale features generated by the encoder-decoder module. The attention mechanism highlights pedestrians while suppressing background interference, which helps to detect pedestrians under various weather conditions. Experiments on two challenging datasets show that our method outperforms the state-of-the-art method, improving precision by 5.1% and 23.78% on the Keimyung University (KMU) and Computer Vision Center (CVC)-09 pedestrian datasets, respectively.
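The two ideas named in the abstract, combining encoder and decoder feature maps through a skip connection and then re-weighting the result with attention, can be illustrated with a minimal sketch. This is not the AED-CNN implementation: the function names, the nearest-neighbour upsampling, the concatenation-based fusion, and the per-channel sigmoid gate (a simplified squeeze-and-excitation-style attention with one scalar weight per channel) are all assumptions chosen for illustration, and feature maps are represented as plain nested lists.

```python
import math

def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out

def skip_connect(encoder_feats, decoder_feats):
    """Assumed fusion: upsample decoder channels, then concatenate along channels."""
    return encoder_feats + [upsample2x(ch) for ch in decoder_feats]

def channel_attention(feats, weights, biases):
    """Assumed attention: scale each channel by sigmoid(w * global_mean + b)."""
    gated = []
    for ch, w, b in zip(feats, weights, biases):
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        gate = 1.0 / (1.0 + math.exp(-(w * mean + b)))
        gated.append([[gate * v for v in row] for row in ch])
    return gated

# Hypothetical inputs: one 4x4 encoder channel and one 2x2 decoder channel.
encoder = [[[1.0] * 4 for _ in range(4)]]
decoder = [[[2.0, 0.0], [0.0, 2.0]]]

fused = skip_connect(encoder, decoder)            # two 4x4 channels
reweighted = channel_attention(fused, [0.0, 0.0], [0.0, 0.0])
```

In a trained network the gate parameters would be learned, so that channels responding to pedestrians receive gates near 1 and background channels receive gates near 0; here they are fixed constants purely to keep the sketch self-contained.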
