Geospatial Object Detection in Remote Sensing Imagery Based on Multiscale Single-Shot Detector with Activated Semantics

Geospatial object detection in high spatial resolution (HSR) remote sensing imagery is an active and challenging problem in automatic image interpretation. Although convolutional neural networks (CNNs) have advanced this domain, two obstacles still largely restrict the performance of detection methods: computational efficiency in real-time applications and accurate localization of relatively small objects in HSR images. To tackle these issues, we first introduce semantic segmentation-aware CNN features to activate the detection feature maps from the lowest-level layer. In conjunction with this segmentation branch, we propose a second module, consisting of several global activation blocks, to enrich the semantic information of the feature maps from higher-level layers. These two parts are then integrated into the original single-shot detection framework. Finally, we use the modified multi-scale feature maps with enriched semantics, together with a multi-task training strategy, to achieve efficient end-to-end detection. Extensive experiments and comprehensive evaluations on a publicly available 10-class object detection dataset demonstrate the superiority of the presented method.
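
The two activation steps described above can be illustrated with a minimal NumPy sketch. The abstract does not specify the exact layers, so everything here is an assumption made for illustration: the function names `segmentation_activation` and `global_activation`, the use of a sigmoid gate, the 1x1 projection `proj`, and the squeeze-and-excitation style bottleneck (`w1`, `w2`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def segmentation_activation(features, seg_logits, proj):
    """Gate a low-level feature map with segmentation-aware features.

    features:   (C, H, W) detection feature map from the lowest-level layer
    seg_logits: (K, H, W) per-pixel class scores from a segmentation branch
    proj:       (C, K) weights of an assumed 1x1 convolution mapping K -> C
    """
    # A 1x1 convolution is a per-pixel linear map over the channel axis.
    gate = sigmoid(np.einsum("ck,khw->chw", proj, seg_logits))
    # Element-wise gating: each location is scaled by its semantic response.
    return features * gate


def global_activation(features, w1, w2):
    """Reweight channels of a higher-level map using global context.

    features: (C, H, W) feature map
    w1:       (R, C) weights of an assumed bottleneck layer
    w2:       (C, R) weights of an assumed expansion layer
    """
    squeezed = features.mean(axis=(1, 2))      # global average pooling -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # ReLU bottleneck -> (R,)
    scale = sigmoid(w2 @ hidden)               # per-channel weights in (0, 1)
    return features * scale[:, None, None]     # broadcast over H and W


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    low = rng.standard_normal((8, 16, 16))     # toy low-level feature map
    seg = rng.standard_normal((10, 16, 16))    # toy scores for 10 classes
    high = rng.standard_normal((8, 4, 4))      # toy higher-level feature map

    low_out = segmentation_activation(low, seg, rng.standard_normal((8, 10)))
    high_out = global_activation(
        high, rng.standard_normal((2, 8)), rng.standard_normal((8, 2))
    )
    print(low_out.shape, high_out.shape)
```

In both functions the gate lies strictly between 0 and 1, so the output keeps the shape of the input feature map and can be passed to the detection heads unchanged; in a trained network the gate weights would be learned jointly with the detector under the multi-task objective.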
