An Improved Rotation Invariant CNN-Based Detector with Rotatable Bounding Boxes for Aerial Image Detection

Methods based on convolutional neural networks (CNNs) have been applied effectively to object detection. However, traditional object detection models face major obstacles in aerial scenes because of the complexity of aerial images and the wide variation of remote sensing objects in density, scale, and orientation. In this article, we apply rotatable bounding boxes (RBox) instead of the traditional bounding box (BBox) and improve the CNN-based network so that it handles the orientation angle of each detected object, which gives the detector its rotation-invariant property. Our model is trained and tested on three classes of remote sensing objects: airplanes, ships, and cars. Compared with two common BBox-based benchmarks, Faster R-CNN and SSD, our model outperforms both, and it also efficiently outputs the orientation angles of the detected objects.
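To illustrate the difference between an RBox and a BBox, the following is a minimal sketch and not the implementation used in this article. It assumes an RBox is parameterized as a center, a width, a height, and an angle in degrees; that parameterization and the function names `rbox_to_corners` and `enclosing_bbox` are hypothetical choices made for this example only.

```python
import math


def rbox_to_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumed parameterization (not taken from the article):
    center (cx, cy), size (w, h), counter-clockwise angle in degrees.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the unrotated box, relative to its center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by theta, then translate to the center.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def enclosing_bbox(corners):
    """Return the axis-aligned box (xmin, ymin, xmax, ymax) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    corners = rbox_to_corners(0.0, 0.0, 4.0, 2.0, 45.0)
    xmin, ymin, xmax, ymax = enclosing_bbox(corners)
    rbox_area = 4.0 * 2.0
    bbox_area = (xmax - xmin) * (ymax - ymin)
    print(f"RBox area: {rbox_area:.2f}, enclosing BBox area: {bbox_area:.2f}")
```

For an elongated object rotated by 45 degrees, the enclosing axis-aligned box covers a noticeably larger area than the rotated box itself, which is one reason a BBox fits oriented aerial objects loosely.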
