Convolutional Neural Network Based Automatic Object Detection on Aerial Images

We are witnessing daily acquisition of large amounts of aerial and satellite imagery. Analysis of such large quantities of data can be helpful for many practical applications. In this letter, we present an automatic content-based analysis of aerial imagery in order to detect and mark arbitrary objects or regions in high-resolution images. For that purpose, we proposed a method for automatic object detection based on a convolutional neural network. A novel two-stage approach for network training is implemented and verified in the tasks of aerial image classification and object detection. First, we tested the proposed training approach using UCMerced data set of aerial images and achieved accuracy of approximately 98.6%. Second, the method for automatic object detection was implemented and verified. For implementation on GPGPU, a required processing time for one aerial image of size 5000 × 5000 pixels was around 30 s.

[1]  David Picard,et al.  Evaluation of second-order visual features for land-use classification , 2014, 2014 12th International Workshop on Content-Based Multimedia Indexing (CBMI).

[2]  Bill Triggs,et al.  Histograms of oriented gradients for human detection , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[3]  Yuning Jiang,et al.  Randomized Spatial Partition for Scene Recognition , 2012, ECCV.

[4]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Jianxin Wu,et al.  mCENTRIST: A Multi-Channel Feature Generation Mechanism for Scene Categorization , 2014, IEEE Transactions on Image Processing.

[6]  Peter L. Bartlett,et al.  Adaptive Online Gradient Descent , 2007, NIPS.

[7]  Jefersson Alex dos Santos,et al.  Improving Spatial Feature Representation from Aerial Scenes by Using Convolutional Networks , 2015, 2015 28th SIBGRAPI Conference on Graphics, Patterns and Images.

[8]  Sabine Süsstrunk,et al.  Multi-spectral SIFT for scene category recognition , 2011, CVPR 2011.

[9]  Yingli Tian,et al.  Pyramid of Spatial Relatons for Scene-Level Land Use Classification , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[10]  Gui-Song Xia,et al.  Transferring Deep Convolutional Neural Networks for the Scene Classification of High-Resolution Remote Sensing Imagery , 2015, Remote. Sens..

[11]  Luisa Verdoliva,et al.  Land Use Classification in Remote Sensing Images by Convolutional Neural Networks , 2015, ArXiv.

[12]  George D. Magoulas,et al.  Learning Rate Adaptation in Stochastic Gradient Descent , 2001 .

[13]  Guillaume Gravier,et al.  Oriented pooling for dense and non-dense rotation-invariant features , 2013, BMVC.

[14]  Shawn D. Newsam,et al.  Bag-of-visual-words and spatial extensions for land-use classification , 2010, GIS '10.

[15]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Anil M. Cheriyadat,et al.  Unsupervised Feature Learning for Aerial Scene Classification , 2014, IEEE Transactions on Geoscience and Remote Sensing.

[17]  Matthijs C. Dorst Distinctive Image Features from Scale-Invariant Keypoints , 2011 .

[18]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[19]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[20]  Shawn D. Newsam,et al.  Spatial pyramid co-occurrence for image classification , 2011, 2011 International Conference on Computer Vision.

[21]  Brian P. Salmon,et al.  Multiview Deep Learning for Land-Use Classification , 2015, IEEE Geoscience and Remote Sensing Letters.

[22]  Cordelia Schmid,et al.  Aggregating Local Image Descriptors into Compact Codes , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.