Scene Classification Based on a Deep Random-Scale Stretched Convolutional Neural Network

With the large number of high-resolution images now being acquired, scene classification of high spatial resolution (HSR) remote sensing imagery has drawn great attention. It remains a challenging task, however, because the complex arrangement of ground objects in HSR imagery creates a semantic gap between low-level features and high-level semantic concepts. Convolutional neural networks (CNNs), which learn essential features automatically from image data, have been introduced for HSR remote sensing image scene classification because of their excellent performance in natural image classification. However, some scene classes in remote sensing images are object-centered, i.e., the scene class of an image is decided by the objects it contains. Although previous CNN-based methods have achieved higher classification accuracies than traditional methods with handcrafted features, they do not consider the scale variation of the objects in the scenes. This makes it difficult to apply CNNs directly to images of object-centered classes and extract features that are robust to scale variation, which leads to misclassified scene images. To solve this problem, this paper proposes scene classification based on a deep random-scale stretched convolutional neural network (SRSCNN) for HSR remote sensing imagery. In the proposed method, patches with a random scale are cropped from the image and stretched to the specified scale as the input to train the CNN, which forces the CNN to extract features that are robust to scale variation. To further improve the performance of the CNN, a robust scene classification strategy, multi-perspective fusion, is adopted. The experimental results obtained using three datasets (the UC Merced dataset, the Google dataset of SIRI-WHU, and the Wuhan IKONOS dataset) confirm that the proposed method performs better than the traditional scene classification methods.
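The random-scale stretching step and the fusion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the scale range (0.5 to 1.0 of the shorter image side), the nearest-neighbour interpolation, and the choice to fuse by averaging class probabilities are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def stretch_nearest(patch, out_size):
    """Resize an (H, W, C) patch to (out_size, out_size, C).

    Nearest-neighbour sampling is an assumption; the abstract only says
    the patch is "stretched to the specified scale".
    """
    h, w = patch.shape[:2]
    rows = np.arange(out_size) * h // out_size  # source row for each output row
    cols = np.arange(out_size) * w // out_size  # source column for each output column
    return patch[rows[:, None], cols[None, :]]


def random_scale_stretch(image, out_size, scale_range=(0.5, 1.0), rng=None):
    """Crop a square patch of random scale and stretch it to out_size.

    scale_range is a hypothetical parameter: the fraction of the shorter
    image side used as the patch side length.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    scale = rng.uniform(*scale_range)
    side = max(1, int(round(scale * min(h, w))))
    top = rng.integers(0, h - side + 1)   # random patch position
    left = rng.integers(0, w - side + 1)
    patch = image[top:top + side, left:left + side]
    return stretch_nearest(patch, out_size)


def fuse_predictions(prob_list):
    """Fuse per-view class probabilities by averaging (assumed fusion rule)."""
    probs = np.mean(np.stack(prob_list), axis=0)
    return int(np.argmax(probs)), probs
```

In training, each call to `random_scale_stretch` would yield a differently scaled view of the same scene at a fixed input size. At test time, the class probabilities predicted for several such views could be passed to `fuse_predictions` to obtain a single label.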
