AMR-Net: Arbitrary-Oriented Ship Detection Using Attention Module, Multi-Scale Feature Fusion and Rotation Pseudo-Label

Ship detection is an important and challenging task in remote sensing. The widely adopted horizontal bounding box representation is poorly suited to ships, which appear at arbitrary orientations. Complex backgrounds, small objects, and the difficulty of labeling ship datasets further limit the performance of traditional ship detection methods. In this paper, we propose AMR-Net, a multi-task rotation detector that combines an attention module, multi-scale feature fusion, and rotation pseudo-labels. AMR-Net adds a Deformable Convolution Channel Attention Block (DCCAB) to suppress background noise and highlight the foreground. A Feature Pyramid Network (FPN) fuses features from different scales, which benefits ship detection. We present an Adaptive FPN (AFPN) that automatically determines the optimal number of prediction layers, which reduces the disturbance that high-level prediction layers cause to small-object detection, decreases the size of the model, and adapts to different ship datasets. To let the detector perform well even with less labeled data, we design a semi-supervised pseudo-label module, Self-Learning Rotation Pseudo-Label (SRP). SRP lets the detector iteratively self-learn the optimal thresholds and use them to refine high-quality rotation pseudo-labels for retraining the model. SRP is optional; when it is enabled, the detector becomes a semi-supervised detector. Extensive supervised and semi-supervised experiments on the public remote sensing dataset HRSC2016 show the state-of-the-art performance of our detector. Experiments on DOTA further illustrate the effectiveness of AMR-Net.
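The threshold-based pseudo-label selection described for SRP can be illustrated with a minimal sketch. The abstract does not say how the thresholds are learned, so everything here is an assumption for illustration: the `RotatedBox` fields, the `filter_pseudo_labels` and `select_threshold` helpers, and the `evaluate` callback (which stands in for something like "retrain on the kept labels and measure validation accuracy") are hypothetical names, not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class RotatedBox:
    """Hypothetical rotated detection: centre, size, angle (degrees), confidence."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float
    score: float


def filter_pseudo_labels(preds: Sequence[RotatedBox], threshold: float) -> List[RotatedBox]:
    """Keep only predictions whose confidence reaches the threshold."""
    return [p for p in preds if p.score >= threshold]


def select_threshold(
    preds: Sequence[RotatedBox],
    candidates: Sequence[float],
    evaluate: Callable[[List[RotatedBox]], float],
) -> float:
    """Return the candidate threshold whose kept pseudo-labels score best.

    `evaluate` is an assumed callback supplied by the caller; in a real
    pipeline it might retrain the detector and report validation quality.
    """
    if not candidates:
        raise ValueError("at least one candidate threshold is required")
    best_t, best_score = candidates[0], float("-inf")
    for t in candidates:
        score = evaluate(filter_pseudo_labels(preds, t))
        if score > best_score:  # strict '>' keeps the earliest candidate on ties
            best_t, best_score = t, score
    return best_t


# Toy usage: predictions on unlabeled images and a stand-in quality measure.
preds = [
    RotatedBox(50.0, 40.0, 120.0, 20.0, 30.0, 0.95),
    RotatedBox(200.0, 90.0, 80.0, 15.0, -45.0, 0.80),
    RotatedBox(310.0, 60.0, 60.0, 12.0, 10.0, 0.40),
    RotatedBox(15.0, 300.0, 30.0, 8.0, 75.0, 0.10),
]


def toy_evaluate(kept: List[RotatedBox]) -> float:
    """Stand-in score: reward many labels, reject sets containing weak ones."""
    return float(len(kept)) if all(p.score >= 0.5 for p in kept) else -1.0


best = select_threshold(preds, [0.3, 0.5, 0.9], toy_evaluate)
pseudo_labels = filter_pseudo_labels(preds, best)
```

In this toy setup the lowest threshold admits a weak detection and the highest discards a useful one, so the middle candidate is selected. An iterative scheme would repeat the select, filter, and retrain steps with the updated detector.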
