Paper Information - Intensive Positioning Network for Remote Sensing Image Captioning

Intensive Positioning Network for Remote Sensing Image Captioning

This paper addresses the problem of information loss during the generation of remote sensing image captions. In the field of artificial intelligence, the automatic description of remote sensing images is an important but rarely studied task. In traditional frameworks, much of the information is lost when the image is processed and classified, because remote sensing images have higher pixel counts and smaller targets. To address this, we propose a new remote sensing image captioning framework based on deep learning and an attention mechanism. The experimental results show that the model can generate a full-sentence description for remote sensing images.

Jiawei Chen | Shengsheng Wang | Guangyao Wang
