Multi-Scale Remote Sensing Semantic Analysis Based on a Global Perspective

Remote sensing image captioning involves describing remote sensing objects and their spatial relationships. However, it is still difficult to determine the spatial extent of a remote sensing object and the appropriate size of a sample patch. If the patch is too large, it includes too many remote sensing objects and complex spatial relationships among them, which increases the computational burden of the image captioning network and reduces its precision. If the patch is too small, it often fails to provide enough environmental and contextual information, which makes the remote sensing object difficult to describe. To address this problem, we propose a multi-scale semantic long short-term memory network (MS-LSTM). The remote sensing images are cut into paired image patches at different spatial scales. First, a Visual Geometry Group (VGG) network extracts features from the large-scale patches, and these features are input into the improved MS-LSTM network as semantic information. The large-scale patches provide a larger receptive field and more contextual semantic information for small-scale image captioning, acting as a global perspective and thereby enabling the accurate identification of small-scale samples that share the same features. Second, a small-scale patch is used to highlight remote sensing objects and simplify their spatial relations. Together, the multiple receptive fields provide perspectives from local to global. The experimental results demonstrated that, compared with the original long short-term memory network (LSTM), the MS-LSTM’s Bilingual Evaluation Understudy (BLEU) score increased by 5.6% to 0.859, which indicates that the MS-LSTM has a more comprehensive receptive field that provides richer semantic information and improves the remote sensing image captions.
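The pairing of patches at two spatial scales can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `crop_centered` and `paired_patches`, the patch sizes (64 and 256 pixels), and the choice to shift windows so that they stay inside the image are all assumptions made for illustration. The sketch only shows how a small-scale patch and a large-scale context patch could be cut around the same location, so that the large patch can later supply contextual features for captioning the small one.

```python
import numpy as np


def crop_centered(image, row, col, size):
    """Crop a size x size window centered near (row, col).

    The window is shifted, not padded, so it always lies inside the image
    (an assumption; the source does not say how borders are handled).
    """
    height, width = image.shape[:2]
    if size > height or size > width:
        raise ValueError("patch size exceeds image dimensions")
    top = min(max(row - size // 2, 0), height - size)
    left = min(max(col - size // 2, 0), width - size)
    return image[top:top + size, left:left + size]


def paired_patches(image, row, col, small_size=64, large_size=256):
    """Return (small, large) patches around the same location.

    small_size and large_size are hypothetical values, not taken from the paper.
    """
    small = crop_centered(image, row, col, small_size)
    large = crop_centered(image, row, col, large_size)
    return small, large


# Usage: a dummy 512 x 512 RGB image stands in for a remote sensing scene.
scene = np.zeros((512, 512, 3), dtype=np.uint8)
small_patch, large_patch = paired_patches(scene, row=300, col=200)
print(small_patch.shape, large_patch.shape)
```

In a pipeline like the one described, the large patch would be passed to a feature extractor such as VGG and the small patch to the captioning branch; both steps are omitted here.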
