Multi-Scale Feature Integrated Attention-Based Rotation Network for Object Detection in VHR Aerial Images

Accurate and robust detection of multi-class objects in very high resolution (VHR) aerial images has been playing a significant role in many real-world applications. The traditional detection methods have made remarkable progresses with horizontal bounding boxes (HBBs) due to CNNs. However, HBB detection methods still exhibit limitations including the missed detection and the redundant detection regions, especially for densely-distributed and strip-like objects. Besides, large scale variations and diverse background also bring in many challenges. Aiming to address these problems, an effective region-based object detection framework named Multi-scale Feature Integration Attention Rotation Network (MFIAR-Net) is proposed for aerial images with oriented bounding boxes (OBBs), which promotes the integration of the inherent multi-scale pyramid features to generate a discriminative feature map. Meanwhile, the double-path feature attention network supervised by the mask information of ground truth is introduced to guide the network to focus on object regions and suppress the irrelevant noise. To boost the rotation regression and classification performance, we present a robust Rotation Detection Network, which can generate efficient OBB representation. Extensive experiments and comprehensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed framework.

[1]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Menglong Yan,et al.  IoU-Adaptive Deformable R-CNN: Make Full Use of IoU for Multi-Class Object Detection in Remote Sensing Imagery , 2019, Remote. Sens..

[4]  Huajun Feng,et al.  Libra R-CNN: Towards Balanced Learning for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Junwei Han,et al.  Multi-class geospatial object detection and geographic image classification based on collection of part detectors , 2014 .

[6]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[7]  Peter Reinartz,et al.  Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery , 2018, ACCV.

[8]  Enhua Wu,et al.  Squeeze-and-Excitation Networks , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[9]  Yangyang Li,et al.  RADet: Refine Feature Pyramid Network and Multi-Layer Attention Network for Arbitrary-Oriented Object Detection of Remote Sensing Images , 2020, Remote. Sens..

[10]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[11]  Shifeng Zhang,et al.  Single-Shot Refinement Neural Network for Object Detection , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Wei Li,et al.  R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection , 2017, ArXiv.

[13]  Yi Li,et al.  R-FCN: Object Detection via Region-based Fully Convolutional Networks , 2016, NIPS.

[14]  Ross B. Girshick,et al.  Mask R-CNN , 2017, 1703.06870.

[15]  Jungong Han,et al.  ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[16]  Jin Wang,et al.  A Cascaded R-CNN With Multiscale Attention and Imbalanced Samples for Traffic Sign Detection , 2020, IEEE Access.

[17]  Thomas Blaschke,et al.  Object based image analysis for remote sensing , 2010 .

[18]  Jiebo Luo,et al.  DOTA: A Large-Scale Dataset for Object Detection in Aerial Images , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[19]  Jin Wang,et al.  Training Convolutional Neural Networks with Multi-Size Images and Triplet Loss for Remote Sensing Scene Classification , 2020, Sensors.

[20]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Liangpei Zhang,et al.  An Efficient and Robust Integrated Geospatial Object Detection Framework for High Spatial Resolution Remote Sensing Imagery , 2017, Remote. Sens..

[22]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Yue Zhang,et al.  SCRDet: Towards More Robust Detection for Small, Cluttered and Rotated Objects , 2018, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Xiaodong Zhang,et al.  Geospatial Object Detection on High Resolution Remote Sensing Imagery Based on Double Multi-Scale Feature Pyramid Network , 2019, Remote. Sens..

[25]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[26]  Yiping Yang,et al.  Ship Rotated Bounding Box Space for Ship Extraction From High-Resolution Optical Satellite Images With Complex Backgrounds , 2016, IEEE Geoscience and Remote Sensing Letters.

[27]  Gang Wan,et al.  Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark , 2020, ISPRS Journal of Photogrammetry and Remote Sensing.

[28]  Gui-Song Xia,et al.  Learning RoI Transformer for Detecting Oriented Objects in Aerial Images , 2018, ArXiv.

[29]  Zhihai Xu,et al.  $\mathcal{R}^2$ -CNN: Fast Tiny Object Detection in Large-Scale Remote Sensing Images , 2019, IEEE Transactions on Geoscience and Remote Sensing.

[30]  Р Ю Чуйков,et al.  Обнаружение транспортных средств на изображениях загородных шоссе на основе метода Single shot multibox Detector , 2017 .

[31]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[32]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[33]  Abhinav Gupta,et al.  Non-local Neural Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[34]  Xin Huang,et al.  Deep networks under scene-level supervision for multi-class geospatial object detection from remote sensing images , 2018, ISPRS Journal of Photogrammetry and Remote Sensing.

[35]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[36]  Wei Guo,et al.  Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network , 2018, Remote. Sens..

[37]  Jian Sun,et al.  Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[38]  Junwei Han,et al.  A Survey on Object Detection in Optical Remote Sensing Images , 2016, ArXiv.

[39]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[40]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[41]  Menglong Yan,et al.  Automatic Ship Detection in Remote Sensing Images from Google Earth of Complex Scenes Based on Multiscale Rotation Dense Feature Pyramid Networks , 2018, Remote. Sens..

[42]  Xiangyang Xue,et al.  Arbitrary-Oriented Scene Text Detection via Rotation Proposals , 2017, IEEE Transactions on Multimedia.

[43]  Jin Wang,et al.  A full convolutional network based on DenseNet for remote sensing scene classification. , 2019, Mathematical biosciences and engineering : MBE.

[44]  Lin Lei,et al.  Arbitrary-Oriented Vehicle Detection in Aerial Imagery with Single Convolutional Neural Networks , 2017, Remote. Sens..

[45]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[46]  Zhaoxiang Zhang,et al.  Scale-Aware Trident Networks for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).