Object Detection in High-Resolution Panchromatic Images Using Deep Models and Spatial Template Matching

Automatic object detection in remote sensing images has attracted significant attention because of its importance in both military and civilian fields. However, low-confidence candidates limit the recognition of potential objects, and unreasonable predicted boxes produce false positives (FPs). To address these issues, an accurate and fast object detection method called the refined single-shot multibox detector (RSSD) is proposed, consisting of a single-shot multibox detector (SSD), a refined network (RefinedNet), and a class-specific spatial template matching (STM) module. In the training stage, the SSD is fed with diversely augmented samples so that it can efficiently extract multiscale features for object classification and localization. Meanwhile, RefinedNet is trained on objects cropped from the training set to further improve its ability to distinguish each object class from the background. Class-specific spatial templates are also constructed from per-class object statistics to provide reliable object templates. During the test phase, RefinedNet raises the confidence of potential objects among the SSD predictions and suppresses that of the background, which improves the detection rate. Furthermore, implausible candidates are rejected by the class-specific spatial templates, which reduces the false alarm rate. Together, these three parts form a single architecture that improves detection accuracy while maintaining speed. Experiments on high-resolution panchromatic (PAN) images from the GaoFen-2 and JiLin-1 satellites demonstrate the effectiveness and efficiency of the proposed modules and of the whole framework.
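The test-phase flow described above (SSD candidates, RefinedNet rescoring, then template-based rejection) can be illustrated with a minimal Python sketch. The abstract does not specify how the templates are built or how the scores are combined, so everything here is an assumption made for illustration: the names `Candidate`, `SizeTemplate`, `build_templates`, `fits_template`, and `refine_and_filter` are hypothetical, the template is reduced to the mean and standard deviation of box width and height per class, and the refined score simply replaces the detector score.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class Candidate:
    label: str
    score: float
    box: Box


@dataclass
class SizeTemplate:
    # Hypothetical template: per-class width/height statistics only.
    w_mean: float
    w_std: float
    h_mean: float
    h_std: float


def build_templates(train_boxes: Dict[str, List[Box]]) -> Dict[str, SizeTemplate]:
    """Collect width/height statistics for each class from training boxes."""
    templates = {}
    for label, boxes in train_boxes.items():
        widths = [x1 - x0 for x0, _, x1, _ in boxes]
        heights = [y1 - y0 for _, y0, _, y1 in boxes]
        templates[label] = SizeTemplate(
            mean(widths), pstdev(widths), mean(heights), pstdev(heights)
        )
    return templates


def fits_template(c: Candidate, t: SizeTemplate, k: float = 3.0) -> bool:
    """Accept a box whose width and height lie within k std of the class mean."""
    w = c.box[2] - c.box[0]
    h = c.box[3] - c.box[1]
    return abs(w - t.w_mean) <= k * t.w_std and abs(h - t.h_mean) <= k * t.h_std


def refine_and_filter(
    candidates: List[Candidate],
    rescore: Callable[[Candidate], float],  # stand-in for a second-stage classifier
    templates: Dict[str, SizeTemplate],
    score_threshold: float = 0.5,
    k: float = 3.0,
) -> List[Candidate]:
    """Rescore each candidate, then drop low-score or template-violating boxes."""
    kept = []
    for c in candidates:
        refined = Candidate(c.label, rescore(c), c.box)
        if refined.score < score_threshold:
            continue  # treated as background after rescoring
        if not fits_template(refined, templates[c.label], k):
            continue  # implausible size for this class
        kept.append(refined)
    return kept
```

In this sketch a low-scoring but correctly sized candidate can be recovered if the second stage assigns it a high score, while a high-scoring box with an implausible size is discarded. The thresholds `score_threshold` and `k` are placeholder values, not settings reported by the authors.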
