Scene semantic classification based on scale invariance convolutional neural networks

Convolutional neural networks (CNNs) have been introduced into remote sensing scene classification, achieving outstanding performance. However, the objects contained in remote sensing scene images vary in scale, which makes it difficult to extract features that are robust to scale and limits further improvement in classification accuracy. In this paper, a method named scale invariance convolutional neural networks (SICNNs) is proposed for remote sensing scene classification. In the proposed method, two images with different scales, generated by randomly stretching one image, are fed into the CNN simultaneously for training at intervals of several iterations. A similarity measure layer is added to the SICNN to make the distance between the two feature vectors extracted from the two images as small as possible, so that the extracted features are robust to scale. Experimental results on two datasets, i.e., the UC Merced dataset and the Google dataset of SIRI-WHU, demonstrate the effectiveness of the proposed method.
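The training idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the stretch range, the distance function, the loss weighting, or the network architecture. The code assumes a squared Euclidean distance as the similarity measure, a nearest-neighbour rescale as the "random stretch", and a hypothetical `toy_features` function standing in for the CNN. All function names and default values are illustrative.

```python
import random


def random_stretch(image, low=0.8, high=1.2, rng=random):
    """Rescale a 2D image (list of rows) by a random factor.

    Uses nearest-neighbour sampling. The range [low, high] is an
    assumption; the abstract does not state the stretch range.
    """
    scale = rng.uniform(low, high)
    h, w = len(image), len(image[0])
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    return [
        [image[min(h - 1, int(r / scale))][min(w - 1, int(c / scale))]
         for c in range(new_w)]
        for r in range(new_h)
    ]


def toy_features(image):
    """Hypothetical stand-in for the CNN feature extractor.

    Returns [global mean, global max]; this is NOT the paper's network.
    """
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels), max(pixels)]


def similarity_loss(f1, f2):
    """Squared Euclidean distance between two feature vectors (assumed measure)."""
    return sum((a - b) ** 2 for a, b in zip(f1, f2))


def total_loss(classification_loss, f1, f2, weight=0.1):
    """Classification loss plus a weighted similarity term (weight is an assumption)."""
    return classification_loss + weight * similarity_loss(f1, f2)


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
    # Two views of the same scene at different random scales.
    view_a = random_stretch(image, rng=rng)
    view_b = random_stretch(image, rng=rng)
    loss = total_loss(1.0, toy_features(view_a), toy_features(view_b))
    print(len(view_a), len(view_b), loss)
```

In an actual network the similarity term would be back-propagated together with the classification loss, pulling the features of the two differently scaled views toward each other; here it is only evaluated, not optimized.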
