Object Detection based on OcSaFPN in Aerial Images with Noise

Deep learning-based algorithms have become a crucial way to boost object detection performance in aerial images. Although various neural network representations have been developed, previous works have rarely investigated noise-resilient performance, especially on noisy aerial images taken by cameras with telephoto lenses; most research concentrates instead on denoising. Denoising usually adds computational burden to obtain higher-quality images, whereas noise resilience describes the robustness of the network itself to different kinds of noise and is therefore an attribute of the algorithm. For this reason, this work starts by analyzing the noise-resilient performance of neural networks and then proposes two hypotheses for building a noise-resilient structure. Based on these hypotheses, we compare the noise-resilient ability of Oct-ResNet, which uses frequency division processing, with that of the commonly used ResNet. In addition, previous feature pyramid networks used for aerial object detection are not specifically designed for the frequency-divided feature maps of Oct-ResNet, and they usually pay little attention to bridging the semantic gap between feature maps from different depths. On this basis, a novel octave convolution-based semantic attention feature pyramid network (OcSaFPN) is proposed to achieve higher accuracy in object detection with noise. Tests on three datasets demonstrate that the proposed OcSaFPN achieves state-of-the-art detection performance under Gaussian noise or multiplicative noise. Further experiments show that the OcSaFPN structure can be easily added to existing algorithms and effectively improves their noise-resilient ability.
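For readers who want to reproduce the two noise conditions named in the abstract, the following is a minimal sketch of how Gaussian (additive) and multiplicative noise might be applied to a test image. The function names, the noise level `sigma`, the zero-mean assumption, and the `[0, 1]` float image range are all illustrative assumptions; the abstract does not state the noise parameters actually used in the experiments.

```python
import numpy as np


def add_gaussian_noise(image, sigma, rng):
    """Additive zero-mean Gaussian noise (assumed model); image is float in [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def add_multiplicative_noise(image, sigma, rng):
    """Multiplicative noise (assumed model): each pixel is scaled by 1 + n, n ~ N(0, sigma^2)."""
    noisy = image * (1.0 + rng.normal(0.0, sigma, size=image.shape))
    return np.clip(noisy, 0.0, 1.0)


# Hypothetical usage on a flat gray image; sigma = 0.1 is an arbitrary choice.
rng = np.random.default_rng(0)
image = np.full((64, 64, 3), 0.5)
gaussian = add_gaussian_noise(image, 0.1, rng)
speckle = add_multiplicative_noise(image, 0.1, rng)
```

Under this model the additive noise perturbs every pixel regardless of its value, while the multiplicative noise scales with pixel intensity, so dark regions are affected less than bright ones.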
