Anchor-Free Single Stage Detector in Remote Sensing Images Based on Multiscale Dense Path Aggregation Feature Pyramid Network

Object detection has always been a challenging task in the field of computer vision due to complex background, large scale variation and many small objects, which are especially pronounced for remote sensing imagery. In recent years, object detection in remote sensing with the development of deep learning has also made great breakthroughs. At present, almost all state-of-the-art object detectors rely on pre-defined anchor boxes for remote sensing imagery. However, too many anchor boxes will introduce a large number of hyper-parameters, which not only increase the memory footprint, but also increase the computational redundancy of the detection model. In contrast, we propose an anchor-free single-stage detector for remote sensing imagery object detection, avoiding a large number of hyper-parameters related to the anchor box, which usually affect the performance of the detection model. Specially, considering the large-scale differences in the objects and the characteristics of small objects in remote sensing imagery, we design a dense path aggregation feature pyramid network (DPAFPN), which can make full use of the high-level semantic information and low-level location information in remote sensing imagery, and to a certain extent, avoid information loss during shallow feature transfer. In our experiments, extensive experiments on two public remote sensing datasets DOTA, NWPU VHR-10 were conducted. The experimental results demonstrate that our detector has good performance and is meaningful for object detection in remote sensing imagery.

[1]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Mengyang Li,et al.  Single Shot Anchor Refinement Network for Oriented Object Detection in Optical Remote Sensing Imagery , 2019, IEEE Access.

[3]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[4]  Jiebo Luo,et al.  DOTA: A Large-Scale Dataset for Object Detection in Aerial Images , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[5]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[8]  Hei Law,et al.  CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[9]  Tong Yang,et al.  MetaAnchor: Learning to Detect Objects with Customized Anchors , 2018, NeurIPS.

[10]  Junwei Han,et al.  Efficient, simultaneous detection of multi-class geospatial targets based on visual saliency modeling and discriminative learning of sparse coding , 2014 .

[11]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[12]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Hao Chen,et al.  FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Abhinav Gupta,et al.  Training Region-Based Object Detectors with Online Hard Example Mining , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Lei Zhang,et al.  Anchor Box Optimization for Object Detection , 2018, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[18]  Zhuowen Tu,et al.  Aggregated Residual Transformations for Deep Neural Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Menglong Yan,et al.  Position Detection and Direction Prediction for Arbitrary-Oriented Ships via Multitask Rotation Region Convolutional Neural Network , 2018, IEEE Access.

[20]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[21]  Zhiao Huang,et al.  Associative Embedding: End-to-End Learning for Joint Detection and Grouping , 2016, NIPS.

[22]  Jürgen Beyerer,et al.  Comprehensive Evaluation of Deep Learning based Detection Methods for Vehicle Detection in Aerial Imagery , 2018, 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS).

[23]  Kai Chen,et al.  Region Proposal by Guided Anchoring , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Yuning Jiang,et al.  Acquisition of Localization Confidence for Accurate Object Detection , 2018, ECCV.

[25]  Anis Koubaa,et al.  Car Detection using Unmanned Aerial Vehicles: Comparison between Faster R-CNN and YOLOv3 , 2018, 2019 1st International Conference on Unmanned Vehicle Systems-Oman (UVS).

[26]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[27]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Qi Tian,et al.  CenterNet: Keypoint Triplets for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[29]  Larry S. Davis,et al.  AutoFocus: Efficient Multi-Scale Inference , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[30]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[31]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[32]  Yi Yang,et al.  DenseBox: Unifying Landmark Localization with End to End Object Detection , 2015, ArXiv.

[33]  Matti Pietikäinen,et al.  Deep Learning for Generic Object Detection: A Survey , 2018, International Journal of Computer Vision.