论文信息 - Corner Proposal Network for Anchor-free, Two-stage Object Detection

Corner Proposal Network for Anchor-free, Two-stage Object Detection

The goal of object detection is to determine the class and location of objects in an image. This paper proposes a novel anchor-free, two-stage framework which first extracts a number of object proposals by finding potential corner keypoint combinations and then assigns a class label to each proposal by a standalone classification stage. We demonstrate that these two stages are effective solutions for improving recall and precision, respectively, and they can be integrated into an end-to-end network. Our approach, dubbed Corner Proposal Network (CPN), enjoys the ability to detect objects of various scales and also avoids being confused by a large number of false-positive proposals. On the MS-COCO dataset, CPN achieves an AP of 49.2% which is competitive among state-of-the-art object detection methods. CPN also fits the scenario of computational efficiency, which achieves an AP of 41.6%/39.7% at 26.2/43.3 FPS, surpassing most competitors with the same inference speed. Code is available at this https URL

[1] Kaiming He,et al. Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Trevor Darrell,et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[3] Zhiqiang Shen,et al. Soft Anchor-Point Object Detection , 2019, ECCV.

[4] Hao Chen,et al. FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[5] Yun Teng,et al. CornerNet-Lite: Efficient Keypoint based Object Detection , 2019, BMVC.

[6] Nuno Vasconcelos,et al. Cascade R-CNN: Delving Into High Quality Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..

[8] Jimmy Ba,et al. Adam: A Method for Stochastic Optimization , 2014, ICLR.

[9] Yi Li,et al. R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[10] Shifeng Zhang,et al. Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Xingyi Zhou,et al. Objects as Points , 2019, ArXiv.

[12] Zhiao Huang,et al. Associative Embedding: End-to-End Learning for Joint Detection and Grouping , 2016, NIPS.

[13] Kai Chen,et al. Region Proposal by Guided Anchoring , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14] Quoc V. Le,et al. EfficientDet: Scalable and Efficient Object Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15] Geoffrey E. Hinton,et al. ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[16] Shu Liu,et al. Path Aggregation Network for Instance Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[17] Ali Farhadi,et al. You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18] Yuning Jiang,et al. FoveaBox: Beyound Anchor-Based Object Detection , 2019, IEEE Transactions on Image Processing.

[19] Ross B. Girshick,et al. Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20] Hong-Yuan Mark Liao,et al. YOLOv4: Optimal Speed and Accuracy of Object Detection , 2020, ArXiv.

[21] Yi Yang,et al. DenseBox: Unifying Landmark Localization with End to End Object Detection , 2015, ArXiv.

[22] Luc Van Gool,et al. The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[23] Marios Savvides,et al. Feature Selective Anchor-Free Module for Single-Shot Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24] Yi Li,et al. Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[25] Huajun Feng,et al. Libra R-CNN: Towards Balanced Learning for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Wei Liu,et al. SSD: Single Shot MultiBox Detector , 2015, ECCV.

[27] Kai Chen,et al. MMDetection: Open MMLab Detection Toolbox and Benchmark , 2019, ArXiv.

[28] Dong Liu,et al. Deep High-Resolution Representation Learning for Human Pose Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[29] Kaiming He,et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[30] Stephen Lin,et al. RepPoints: Point Set Representation for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[31] Quoc V. Le,et al. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks , 2019, ICML.

[32] Jia Deng,et al. Stacked Hourglass Networks for Human Pose Estimation , 2016, ECCV.

[33] Larry S. Davis,et al. Soft-NMS — Improving Object Detection with One Line of Code , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[34] Hei Law,et al. CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[35] Pietro Perona,et al. Microsoft COCO: Common Objects in Context , 2014, ECCV.

[36] Junjie Yan,et al. Grid R-CNN , 2018, 1811.12030.

[37] Lars Petersson,et al. DeNet: Scalable Real-Time Object Detection with Directed Sparse Sampling , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[38] Luca Antiga,et al. Automatic differentiation in PyTorch , 2017 .

[39] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40] Rongrong Ji,et al. FreeAnchor: Learning to Match Anchors for Visual Object Detection , 2019, NeurIPS.

[41] Zhaoxiang Zhang,et al. Scale-Aware Trident Networks for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[42] Yuning Jiang,et al. UnitBox: An Advanced Object Detection Network , 2016, ACM Multimedia.

[43] Kilian Q. Weinberger,et al. Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44] Ross B. Girshick,et al. Fast R-CNN , 2015, 1504.08083.

[45] Shifeng Zhang,et al. Single-Shot Refinement Neural Network for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46] Trevor Darrell,et al. Deep Layer Aggregation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[47] Qi Tian,et al. CenterNet: Keypoint Triplets for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[48] Xingyi Zhou,et al. Bottom-Up Object Detection by Grouping Extreme and Center Points , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49] Andrew Zisserman,et al. Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[50] Zhaoxiang Zhang,et al. Revisiting Feature Alignment for One-stage Object Detection , 2019, ArXiv.

[51] Ross B. Girshick,et al. Mask R-CNN , 2017, 1703.06870.