Extended Feature Pyramid Network with Adaptive Scale Training Strategy and Anchors for Object Detection in Aerial Images

Multi-scale object detection is a basic challenge in computer vision. Although many advanced methods based on convolutional neural networks have succeeded in natural images, the progress in aerial images has been relatively slow mainly due to the considerably huge scale variations of objects and many densely distributed small objects. In this paper, considering that the semantic information of the small objects may be weakened or even disappear in the deeper layers of neural network, we propose a new detection framework called Extended Feature Pyramid Network (EFPN) for strengthening the information extraction ability of the neural network. In the EFPN, we first design the multi-branched dilated bottleneck (MBDB) module in the lateral connections to capture much more semantic information. Then, we further devise an attention pathway for better locating the objects. Finally, an augmented bottom-up pathway is conducted for making shallow layer information easier to spread and further improving performance. Moreover, we present an adaptive scale training strategy to enable the network to better recognize multi-scale objects. Meanwhile, we present a novel clustering method to achieve adaptive anchors and make the neural network better learn data features. Experiments on the public aerial datasets indicate that the presented method obtain state-of-the-art performance.

[1]  King Ngi Ngan,et al.  A2RMNet: Adaptively Aspect Ratio Multi-Scale Network for Object Detection in Remote Sensing Images , 2019, Remote. Sens..

[2]  Jitendra Malik,et al.  Region-Based Convolutional Networks for Accurate Object Detection and Segmentation , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  Kil To Chong,et al.  Object Detection in Very High-Resolution Aerial Images Using One-Stage Densely Connected Feature Pyramid Network , 2018, Sensors.

[4]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[5]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Menglong Yan,et al.  IoU-Adaptive Deformable R-CNN: Make Full Use of IoU for Multi-Class Object Detection in Remote Sensing Imagery , 2019, Remote. Sens..

[7]  Weiguo Gong,et al.  Web-Net: A Novel Nest Networks with Ultra-Hierarchical Sampling for Building Extraction from Aerial Imageries , 2019, Remote. Sens..

[8]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  B D Satoto,et al.  Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster , 2018, IOP Conference Series: Materials Science and Engineering.

[10]  Weihong Li,et al.  Object Detection in Remote Sensing Images Based on a Scene-Contextual Feature Pyramid Network , 2019, Remote. Sens..

[11]  Qing Liu,et al.  Accurate Object Localization in Remote Sensing Images Based on Convolutional Neural Networks , 2017, IEEE Transactions on Geoscience and Remote Sensing.

[12]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[13]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[14]  Koen E. A. van de Sande,et al.  Selective Search for Object Recognition , 2013, International Journal of Computer Vision.