What do We Learn by Semantic Scene Understanding for Remote Sensing imagery in CNN framework?

Recently, deep convolutional neural network (DCNN) achieved increasingly remarkable success and rapidly developed in the field of natural image recognition. Compared with the natural image, the scale of remote sensing image is larger and the scene and the object it represents are more macroscopic. This study inquires whether remote sensing scene and natural scene recognitions differ and raises the following questions: What are the key factors in remote sensing scene recognition? Is the DCNN recognition mechanism centered on object recognition still applicable to the scenarios of remote sensing scene understanding? We performed several experiments to explore the influence of the DCNN structure and the scale of remote sensing scene understanding from the perspective of scene complexity. Our experiment shows that understanding a complex scene depends on an in-depth network and multiple-scale perception. Using a visualization method, we qualitatively and quantitatively analyze the recognition mechanism in a complex remote sensing scene and demonstrate the importance of multi-objective joint semantic support.

[1]  Yinda Zhang,et al.  LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop , 2015, ArXiv.

[2]  Zhao-Liang Li,et al.  Scale Issues in Remote Sensing: A Review on Analysis, Processing and Modeling , 2009, Sensors.

[3]  Bin Fang,et al.  Scene classification based on single-layer SAE and SVM , 2015, Expert Syst. Appl..

[4]  Gui-Song Xia,et al.  AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[5]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[6]  Thomas Blaschke,et al.  Object based image analysis for remote sensing , 2010 .

[7]  Roberto Cipolla,et al.  SceneNet: Understanding Real World Indoor Scenes With Synthetic Data , 2015, ArXiv.

[8]  Axel Pinz,et al.  Active fusion for remote sensing image understanding , 1995, Remote Sensing.

[9]  Usman Saeed Eye movements during scene understanding for biometric identification , 2016, Pattern Recognit. Lett..

[10]  Limin Wang,et al.  Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs , 2016, IEEE Transactions on Image Processing.

[11]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[12]  Bolei Zhou,et al.  Learning Deep Features for Discriminative Localization , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[14]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Nuno Vasconcelos,et al.  Scene classification with semantic Fisher vectors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Ming Cui,et al.  Scene classification based on multifeature probabilistic latent semantic analysis for high spatial resolution remote sensing images , 2015 .

[17]  U. Benz,et al.  Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information , 2004 .

[18]  Tsampikos Kounalakis Depth-adaptive methodologies for 3D image caregorization , 2015 .

[19]  Krista A. Ehinger,et al.  SUN database: Large-scale scene recognition from abbey to zoo , 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[20]  Guosheng Lin,et al.  Deep convolutional neural fields for depth estimation from a single image , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Qingming Huang,et al.  Relay Backpropagation for Effective Learning of Deep Convolutional Neural Networks , 2015, ECCV.

[22]  Trevor Darrell,et al.  Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation , 2013, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Miguel Cazorla,et al.  Scene classification based on semantic labeling , 2016, Adv. Robotics.

[24]  Xueying Qin,et al.  Deep Depth Super-Resolution: Learning Depth Super-Resolution Using Deep Convolutional Neural Network , 2016, ACCV.

[25]  Amy Loutfi,et al.  Classification and Segmentation of Satellite Orthoimagery Using Convolutional Neural Networks , 2016, Remote. Sens..

[26]  Andrew Y. Ng,et al.  Selecting Receptive Fields in Deep Networks , 2011, NIPS.