ReFPN-FCOS: One-Stage Object Detection for Feature Learning and Accurate Localization

One-stage object detectors are simple and efficient, but their simple structure limits the object features they can extract. In addition, the classification score does not reflect how well a candidate box is localized, so using the classification score alone to rank candidate boxes in the non-maximum suppression (NMS) stage is inaccurate. Both shortcomings degrade detection accuracy. In this paper, a novel feature pyramid architecture named refined feature pyramid network (ReFPN) is introduced to obtain better object features. ReFPN adds a refined module that runs in parallel with the feature pyramid network (FPN) to extract the semantic features of objects; the extracted features are then used to optimize the FPN features by summation. We also design a refined center-ness (RCenter-ness) branch that predicts a position score for each point on the feature map. The predicted position score is multiplied by the classification score to obtain a final position score that correlates more strongly with localization accuracy. This final position score is passed to the subsequent NMS, which improves localization accuracy. The proposed method is named ReFPN-FCOS. Experiments on the COCO2017 dataset demonstrate that ReFPN-FCOS improves both classification accuracy and localization accuracy. Its average precision is 1.1% and 1.3% higher than that of FCOS when using ResNet50 and ResNet101 as the backbone, respectively. Code download link: https://github.com/xjl-le/mmdete
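The score fusion described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`iou_one_to_many`, `fused_score_nms`), the IoU threshold of 0.5, and the toy boxes and scores are all assumptions made for illustration, and the NMS shown is the standard greedy variant. The sketch multiplies each classification score by its predicted position score and ranks boxes by the product before suppression.

```python
import numpy as np


def iou_one_to_many(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) form."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def fused_score_nms(boxes, cls_scores, pos_scores, iou_thr=0.5):
    """Greedy NMS ranked by classification score times position score.

    iou_thr=0.5 is an assumed value, not one taken from the paper.
    Returns the indices of the kept boxes and the fused scores.
    """
    final = cls_scores * pos_scores      # final position score
    order = np.argsort(-final)           # highest fused score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        ious = iou_one_to_many(boxes[best], boxes[rest])
        order = rest[ious <= iou_thr]    # drop boxes overlapping the kept one
    return keep, final


# Toy data (hypothetical): boxes 0 and 1 overlap heavily, box 2 is separate.
boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]], dtype=float)
cls_scores = np.array([0.9, 0.8, 0.6])
pos_scores = np.array([0.5, 0.9, 0.7])

keep_fused, final = fused_score_nms(boxes, cls_scores, pos_scores)
keep_cls_only, _ = fused_score_nms(boxes, cls_scores, np.ones_like(cls_scores))
print(keep_fused, keep_cls_only)
```

In this toy case, ranking by classification score alone keeps box 0, whereas the fused score keeps box 1, whose position score is higher. This is the behavior the abstract attributes to the final position score.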
