Rotation and Scale-Invariant Object Detector for High Resolution Optical Remote Sensing Images

Object detection of high-resolution optical remote sensing images is challenging due to two fundamental problems. One is the huge scale variation of objects in images, e.g., small vehicle and cross-sea bridge. The other one is the objects could take on arbitrary orientations because of the high angle shot. In this paper, we propose a Rotation and Scale-invariant Detector (RS-Det) for remote sensing images to solve the above problem in an unified network. Specifically, RS-Det consists of a deformable convolution module to learn spatial transformation (such as rotation, transition, etc) and a feature pyramid architecture for multi-scale feature representation. These two modules enable a better feature learning of convolutional neural network and boost the performance by 3.6% compared with the baseline method. In DOTA, a large-scale dataset for aerial image object detection, our RS-Det achieves the state-of-the-art accuracy, which verifies our method’s superiority.

[1]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  Stephen Lin,et al.  Deformable ConvNets V2: More Deformable, Better Results , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[4]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[5]  Jiebo Luo,et al.  DOTA: A Large-Scale Dataset for Object Detection in Aerial Images , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[7]  Xian Sun,et al.  Object Detection in High-Resolution Remote Sensing Images Using Rotation Invariant Parts Based Model , 2014, IEEE Geoscience and Remote Sensing Letters.

[8]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Larry S. Davis,et al.  Vehicle Detection Using Partial Least Squares , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[11]  Yi Li,et al.  Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[13]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[14]  Shiming Xiang,et al.  Vehicle Detection in Satellite Images by Hybrid Deep Convolutional Neural Networks , 2014, IEEE Geoscience and Remote Sensing Letters.

[15]  Simon Haykin,et al.  GradientBased Learning Applied to Document Recognition , 2001 .

[16]  Fan Chen,et al.  Rotation-Invariant Object Detection in Remote Sensing Images Based on Radial-Gradient Angle , 2015, IEEE Geoscience and Remote Sensing Letters.

[17]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).