Object detection in optical remote sensing images by integrating object-to-object relationships

ABSTRACT In recent years, deep-learning-based methods for remote sensing image interpretation have developed rapidly, driven by the growing volume of image data and advances in machine learning techniques. The abundant spatial and contextual information within the images can help improve interpretation performance. However, most current deep-learning-based methods ignore this contextual information. In this letter, we exploit contextual information through object-to-object relationships in order to enhance the feature representation of individual objects. Specifically, we first build a knowledge database that captures the relationships between different categories, and then generate a region-to-region graph that indicates the relationships between different regions of interest (RoIs). For each RoI, the features of its related regions are combined with the original region features, and the fused features are finally used for object detection. Experiments conducted on a public ten-class object detection dataset demonstrate the validity of the proposed method.
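
The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the implementation used in the letter: the abstract does not specify how the knowledge database is built, how graph edges are weighted, or how features are fused. The sketch assumes that the knowledge database is a row-normalized category co-occurrence matrix, that edge weights are derived from per-RoI class probabilities, and that fusion is a simple concatenation. All function names (`cooccurrence_knowledge`, `region_graph`, `fuse_features`) are hypothetical.

```python
import numpy as np


def cooccurrence_knowledge(image_labels, num_classes):
    """Assumed knowledge database: how often two categories share an image.

    image_labels is a list of per-image category-id lists. Rows are
    normalized to sum to 1 (rows with no co-occurrences stay all-zero).
    """
    counts = np.zeros((num_classes, num_classes))
    for labels in image_labels:
        present = sorted(set(labels))
        for a in present:
            for b in present:
                if a != b:
                    counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def region_graph(class_probs, knowledge):
    """Assumed region-to-region graph.

    Edge weight e_ij = sum_{a,b} p_i[a] * K[a, b] * p_j[b], where p_i is the
    class-probability vector of RoI i. Self-edges are removed and each row
    is normalized to sum to 1 where possible.
    """
    edges = class_probs @ knowledge @ class_probs.T
    np.fill_diagonal(edges, 0.0)
    row_sums = edges.sum(axis=1, keepdims=True)
    return np.divide(edges, row_sums, out=np.zeros_like(edges), where=row_sums > 0)


def fuse_features(features, edges):
    """Assumed fusion: concatenate each RoI's features with the
    edge-weighted sum of the features of its related RoIs."""
    context = edges @ features
    return np.concatenate([features, context], axis=1)


if __name__ == "__main__":
    # Toy data: three training images and three RoIs with one-hot class scores.
    knowledge = cooccurrence_knowledge([[0, 1], [0, 1], [2]], num_classes=3)
    probs = np.eye(3)
    feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    fused = fuse_features(feats, region_graph(probs, knowledge))
    print(fused.shape)
```

In this toy setup, the fused feature of each RoI has twice the original dimensionality; a detector head would then classify and regress boxes from the fused features rather than from the original ones.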
