Region-based convolutional neural networks for object detection in very high resolution remote sensing images

Automatic object detection in high-resolution remote sensing images has recently become a key problem in the application of remote sensing technology. Traditional methods, such as bag-of-visual-words (BOVW), perform well in simple scenes, but their performance drops quickly in complex scenes. In this paper, we make a first attempt to apply a currently popular deep learning technique, region-based convolutional neural networks (R-CNN), to detect aircraft in complex environments in high-resolution remote sensing images. This method has proved very effective for object detection in natural images, and here we introduce it into the field of remote sensing. In our experiments, we also compare the impact of different proposal generation methods on the final detection results, and we propose some practical tips for accelerating detection. After detection, we apply a novel algorithm, which we call box-fusion, to eliminate redundant and repeated boxes that cover the same object. The experimental results show that the R-CNN method is much more effective and robust than the traditional BOVW method for aircraft detection in complex scenes in high-resolution remote sensing images.
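
The abstract does not describe how box-fusion works, so the following is only a hypothetical illustration of the general idea of collapsing redundant boxes that cover the same object. It is a minimal sketch, assuming boxes in `(x1, y1, x2, y2)` form with one confidence score each, and it uses a generic greedy, IoU-based, score-weighted merge rather than the authors' algorithm. The names `iou`, `fuse_boxes`, and `iou_threshold` are assumptions introduced for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(
    boxes: List[Box], scores: List[float], iou_threshold: float = 0.5
) -> List[Tuple[Box, float]]:
    """Greedily group overlapping boxes and merge each group into one box.

    Hypothetical stand-in for a box-fusion step, NOT the paper's method:
    each group is replaced by the score-weighted average of its members,
    and the merged box keeps the highest score in the group.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    used = [False] * len(boxes)
    fused: List[Tuple[Box, float]] = []
    for i in order:
        if used[i]:
            continue
        # The seed box plus every unused box that overlaps it enough.
        group = [i] + [
            j for j in order
            if j != i and not used[j] and iou(boxes[i], boxes[j]) >= iou_threshold
        ]
        for j in group:
            used[j] = True
        total = sum(scores[j] for j in group)
        if total > 0:
            weights = [scores[j] / total for j in group]
        else:
            weights = [1.0 / len(group)] * len(group)
        merged = tuple(
            sum(w * boxes[j][k] for w, j in zip(weights, group)) for k in range(4)
        )
        fused.append((merged, max(scores[j] for j in group)))
    return fused
```

Called with two heavily overlapping detections and one distant detection, `fuse_boxes` returns two boxes: one merged box for the overlapping pair and the distant box unchanged. Unlike plain non-maximum suppression, which discards the lower-scoring boxes, this sketch lets every box in a group contribute to the final coordinates.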
