Traffic Sign Detection and Recognition Using Multi-Scale Fusion and Prime Sample Attention

Traffic sign detection, though one of the key technologies in intelligent transportation, still has bottleneck in accuracy due to the small size and diversity of traffic signs. To solve this problem, we proposed a two-stage CNN object detection algorithm based on multi-scale feature fusion and prime sample attention. We improved the original Faster R-cnn model in terms of feature extraction and sampling strategy. For feature extraction, to elevate the ability of neural networks to detect small objects, we adopted HRNet as the feature extractor. There are four stages in HRNet - a series of high resolution subnets as the starting point with repeated adding parallel high to low resolution subnets to form other stages. In the whole process, the information in the parallel multi-resolution sub-network is repeatedly exchanged to perform repeated multi-scale fusion. For sampling strategy, we adopted a simple and effective sampling and learning strategy called Prime Sample Attention (PISA), consisting of Importance-based Sample Reweighting (ISR) and Classification Aware Regression Loss (CARL). PISA proposed the concepts of IoU Hierarchical Partial Sorting (IoU-HLR) and Hierarchical Partial Score Sorting (Score-HLR), which sort the importance of positive samples and negative samples in mini-batch respectively. With the proposed method, the training process is focusing on prime samples rather than evenly treat all ones. The algorithm complexity of our method is lower than that of other state-of-the-art. After experiments by TT100K dataset, our method can attain a comparable or even better detection accuracy and robustness.

[1]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[2]  Alexander S. Ecker,et al.  Benchmarking Robustness in Object Detection: Autonomous Driving when Winter is Coming , 2019, ArXiv.

[3]  Hyeon-Cheol Shin,et al.  Data Augmentation Method of Object Detection for Deep Learning in Maritime Image , 2020, 2020 IEEE International Conference on Big Data and Smart Computing (BigComp).

[4]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[6]  Baoli Li,et al.  Traffic-Sign Detection and Classification in the Wild , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[8]  Dong Liu,et al.  Deep High-Resolution Representation Learning for Human Pose Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Silvio Savarese,et al.  Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Yu Zhang,et al.  Real-time small traffic sign detection with revised faster-RCNN , 2018, Multimedia Tools and Applications.

[11]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[12]  Yoichi Kageyama,et al.  Method for Extracting Circular Road Signs on the Basis of Scene Image Features , 2010 .

[13]  Hyu Soung Shin,et al.  An Application of a Deep Learning Algorithm for Automatic Detection of Unexpected Accidents Under Bad CCTV Monitoring Conditions in Tunnels , 2019, 2019 International Conference on Deep Learning and Machine Learning in Emerging Applications (Deep-ML).

[14]  Kai Chen,et al.  Prime Sample Attention in Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Shan Yu,et al.  Hot Anchors: A Heuristic Anchors Sampling Method in RCNN-Based Object Detection , 2018, Sensors.

[17]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Dong Liu,et al.  High-Resolution Representations for Labeling Pixels and Regions , 2019, ArXiv.

[19]  Guigang Zhang,et al.  Deep Learning , 2016, Int. J. Semantic Comput..

[20]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[22]  J. Platt Sequential Minimal Optimization : A Fast Algorithm for Training Support Vector Machines , 1998 .

[23]  Lei Gao,et al.  Deep Reinforcement Learning with Parameterized Action Space for Object Detection , 2018, 2018 IEEE International Symposium on Multimedia (ISM).

[24]  Satoshi Asano,et al.  Estimation of Internal Area in the Circular Road Signs from Color Scene Image , 2004 .

[25]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[26]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[27]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Nuno Vasconcelos,et al.  Cascade R-CNN: Delving Into High Quality Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[29]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Hironobu Fujiyoshi,et al.  Traffic Sign Recognition Using SIFT Features , 2009 .

[32]  Cyrus Shahabi,et al.  A Deep Learning Approach for Road Damage Detection from Smartphone Images , 2018, 2018 IEEE International Conference on Big Data (Big Data).

[33]  Cheng Wang,et al.  Using mobile laser scanning data for automated extraction of road markings , 2014 .

[34]  He Cheng,et al.  Object Detection of Armored Vehicles Based on Deep Learning in Battlefield Environment , 2017, 2017 4th International Conference on Information Science and Control Engineering (ICISCE).

[35]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[36]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[37]  Thomas G. Dietterich,et al.  Benchmarking Neural Network Robustness to Common Corruptions and Surface Variations , 2018, 1807.01697.