Deep Feature Fusion by Competitive Attention for Pedestrian Detection

Pedestrian detection is a key problem for autonomous driving, and deep convolutional networks have improved results significantly. However, there is still room to improve performance by carefully handling some critical issues. To exploit more discriminative information for pedestrian detection, we propose a novel architecture that automatically selects semantic as well as specific information among feature maps at different levels and integrates valuable information among feature maps at multiple scales. In particular, the architecture consists of two operations: concatenating feature maps from different levels and integrating feature maps across multiple scales. Both operations are equipped with a competitive attention block. The architecture can thereby obtain more efficient and discriminative features for pedestrian detection. Compared with other prevailing models, our architecture provides superior performance, and experiments show that it achieves a new state of the art on the Caltech dataset.
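To make the idea of attention-gated feature fusion more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a squeeze-and-excitation style gate in which channels from a shallow and a deep feature map are concatenated and then reweighted jointly, so that the two sources compete for importance. The function names `upsample_nearest` and `competitive_attention_fuse`, the reduction ratio `r`, and the random weights `w1` and `w2` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def competitive_attention_fuse(shallow, deep, w1, w2):
    """Concatenate two (C, H, W) maps and reweight channels with one joint gate.

    Assumption: an SE-style gate (global average pool -> FC -> ReLU -> FC ->
    sigmoid) computed over the concatenated channels, so both sources are
    weighed against each other. This is an illustrative stand-in for the
    competitive attention block, not its published definition.
    """
    fused = np.concatenate([shallow, deep], axis=0)   # (2C, H, W)
    squeezed = fused.mean(axis=(1, 2))                # global average pool, (2C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)           # bottleneck FC + ReLU
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # FC + sigmoid, (2C,)
    return fused * gates[:, None, None], gates

rng = np.random.default_rng(0)
C, r = 8, 4                                           # channels, reduction ratio (hypothetical)
shallow = rng.standard_normal((C, 16, 16))            # high-resolution, low-level map
deep = upsample_nearest(rng.standard_normal((C, 8, 8)), 2)  # low-resolution map, upsampled
w1 = rng.standard_normal((2 * C // r, 2 * C)) * 0.1   # untrained illustrative weights
w2 = rng.standard_normal((2 * C, 2 * C // r)) * 0.1
out, gates = competitive_attention_fuse(shallow, deep, w1, w2)
print(out.shape)
```

In a trained network the gate weights would be learned end to end; here they are random, so the sketch only demonstrates the data flow and tensor shapes of concatenation followed by channel reweighting.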
