Remote sensing target detection in a harbor area based on an arbitrary-oriented convolutional neural network

Abstract. With the rapid development of optical remote sensing, reliable target detection methods are urgently needed. Compared with traditional detection algorithms, convolutional neural networks have attracted considerable attention owing to their efficiency and high transferability. However, unlike general images, remote sensing images contain complex background information and dense small targets with variable orientations, which make detection very challenging. To address these problems and provide a stable, high-performance detection method, a rotated saliency fusion object detection (RSD) model based on "you only look once" (YOLO) v4 is established. First, salient image fusion is used to amplify target information. Second, an angle variable and rotated non-maximum suppression are introduced to improve the accuracy of rotated object detection, including the detection of dense objects. Third, the network structure is enhanced to improve small-target detection. Finally, the k-means algorithm and data augmentation are introduced to increase the robustness of the model. Extensive experiments demonstrate the superiority of the proposed model in detection speed and accuracy. The mean average precision of the proposed RSD model reaches 97.32% on remote sensing images of a harbor area, with an average detection speed of 13.41 s⁻¹.
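The abstract does not say how the k-means algorithm is applied. In YOLO-family detectors it is commonly used to cluster the widths and heights of training boxes into anchor priors, so the following minimal sketch assumes that use. The function names, the deterministic initialization, and the box sizes are all hypothetical and are not taken from the paper.

```python
def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100):
    """Cluster (width, height) pairs into k anchors, using 1 - IoU as distance.

    Assumes 1 <= k <= len(boxes). Initial anchors are picked at evenly spaced
    positions in the area-sorted box list (a simplification that keeps the
    sketch deterministic; random initialization is more usual).
    """
    by_area = sorted(boxes, key=lambda b: b[0] * b[1])
    n = len(by_area)
    anchors = [by_area[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    for _ in range(iters):
        # Assignment step: each box joins the anchor with the highest IoU.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, anchors[i]))
            clusters[best].append(box)

        # Update step: each anchor becomes the mean size of its members.
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep an empty cluster's anchor
                continue
            w = sum(m[0] for m in members) / len(members)
            h = sum(m[1] for m in members) / len(members)
            new_anchors.append((w, h))

        if new_anchors == anchors:  # assignments no longer change
            break
        anchors = new_anchors

    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical box sizes in pixels: three small, three elongated, three large.
boxes = [(10, 12), (11, 13), (12, 11),
         (50, 20), (52, 22), (48, 19),
         (100, 90), (95, 100), (105, 95)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Using 1 - IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which matters when small targets are the main concern.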
