PBNet: Part-based convolutional neural network for complex composite object detection in remote sensing imagery

Abstract In recent years, deep learning-based algorithms have brought great improvements to rigid object detection. In addition to rigid objects, remote sensing images also contain many complex composite objects, such as sewage treatment plants, golf courses, and airports, which have neither a fixed shape nor a fixed size. In this paper, we validate through experiments that the results of existing methods in detecting composite objects are not satisfying enough. Therefore, we propose a unified part-based convolutional neural network (PBNet), which is specifically designed for composite object detection in remote sensing imagery. PBNet treats a composite object as a group of parts and incorporates part information into context information to improve composite object detection. Correct part information can guide the prediction of a composite object, thus solving the problems caused by various shapes and sizes. To generate accurate part information, we design a part localization module to learn the classification and localization of part points using bounding box annotation only. A context refinement module is designed to generate more discriminative features by aggregating local context information and global context information, which enhances the learning of part information and improve the ability of feature representation. We selected three typical categories of composite objects from a public dataset to conduct experiments to verify the detection performance and generalization ability of our method. Meanwhile, we build a more challenging dataset about a typical kind of complex composite objects, i.e., sewage treatment plants. It refers to the relevant information from authorities and experts. This dataset contains sewage treatment plants in seven cities in the Yangtze valley, covering a wide range of regions. Comprehensive experiments on two datasets show that PBNet surpasses the existing detection algorithms and achieves state-of-the-art accuracy.

[1]  Quoc V. Le,et al.  NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Timothy F. Cootes,et al.  Active Appearance Models , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[3]  Steve Branson,et al.  Efficient Large-Scale Structured Learning , 2013, CVPR.

[4]  Fuchun Sun,et al.  HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Pietro Perona,et al.  A Probabilistic Approach to Object Recognition Using Local Photometry and Global Geometry , 1998, ECCV.

[6]  Kun Fu,et al.  FMSSD: Feature-Merged Single-Shot Detection for Multiscale Objects in Large-Scale Remote Sensing Imagery , 2020, IEEE Transactions on Geoscience and Remote Sensing.

[7]  Xin Huang,et al.  Deep networks under scene-level supervision for multi-class geospatial object detection from remote sensing images , 2018, ISPRS Journal of Photogrammetry and Remote Sensing.

[8]  Frédéric Jurie,et al.  Vehicle detection in aerial imagery : A small target detection benchmark , 2016, J. Vis. Commun. Image Represent..

[9]  Stephen Lin,et al.  RepPoints: Point Set Representation for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[10]  Xinwei Zheng,et al.  Efficient Saliency-Based Object Detection in Remote Sensing Images Using Deep Belief Networks , 2016, IEEE Geoscience and Remote Sensing Letters.

[11]  Lin Lei,et al.  Multi-scale object detection in remote sensing imagery with convolutional neural networks , 2018, ISPRS Journal of Photogrammetry and Remote Sensing.

[12]  Ross B. Girshick,et al.  Focal Loss for Dense Object Detection , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[13]  Yue Zhang,et al.  Rotation-aware and multi-scale convolutional neural network for object detection in remote sensing images , 2020 .

[14]  Jian Dong,et al.  Attentive Contexts for Object Detection , 2016, IEEE Transactions on Multimedia.

[15]  Xian Sun,et al.  A generic discriminative part-based model for geospatial object detection in optical remote sensing images , 2015 .

[16]  Peter Reinartz,et al.  Towards Multi-class Object Detection in Unconstrained Remote Sensing Imagery , 2018, ACCV.

[17]  Nojun Kwak,et al.  Enhancement of SSD by concatenating feature maps for object detection , 2017, BMVC.

[18]  Wei Guo,et al.  Geospatial Object Detection in High Resolution Satellite Images Based on Multi-Scale Convolutional Neural Network , 2018, Remote. Sens..

[19]  Alan L. Yuille,et al.  Feature extraction from faces using deformable templates , 2004, International Journal of Computer Vision.

[20]  Georgios Tzimiropoulos,et al.  Human Pose Estimation via Convolutional Part Heatmap Regression , 2016, ECCV.

[21]  Daniel Snow,et al.  Efficient Deformable Template Detection and Localization without User Initialization , 2000, Comput. Vis. Image Underst..

[22]  Yiping Yang,et al.  Ship Rotated Bounding Box Space for Ship Extraction From High-Resolution Optical Satellite Images With Complex Backgrounds , 2016, IEEE Geoscience and Remote Sensing Letters.

[23]  Trevor Darrell,et al.  Deep Layer Aggregation , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[24]  Lei Guo,et al.  Object Detection in Optical Remote Sensing Images Based on Weakly Supervised Learning and High-Level Feature Learning , 2015, IEEE Transactions on Geoscience and Remote Sensing.

[25]  Gellért Máttyus,et al.  Fast Multiclass Vehicle Detection on Aerial Images , 2015, IEEE Geoscience and Remote Sensing Letters.

[26]  Fuqiang Zhou,et al.  FSSD: Feature Fusion Single Shot Multibox Detector , 2017, ArXiv.

[27]  Gang Wan,et al.  Object Detection in Optical Remote Sensing Images: A Survey and A New Benchmark , 2020, ISPRS Journal of Photogrammetry and Remote Sensing.

[28]  David A. McAllester,et al.  Object Detection with Discriminatively Trained Part Based Models , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[29]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[30]  Gong Cheng,et al.  P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization , 2022, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[31]  Trevor Darrell,et al.  Part-Based R-CNNs for Fine-Grained Category Detection , 2014, ECCV.

[32]  Junwei Han,et al.  Learning Rotation-Invariant Convolutional Neural Networks for Object Detection in VHR Optical Remote Sensing Images , 2016, IEEE Transactions on Geoscience and Remote Sensing.

[33]  Xiaoqiang Lu,et al.  Hierarchical and Robust Convolutional Neural Network for Very High-Resolution Remote Sensing Object Detection , 2019, IEEE Transactions on Geoscience and Remote Sensing.

[34]  Hei Law,et al.  CornerNet: Detecting Objects as Paired Keypoints , 2018, ECCV.

[35]  Quoc V. Le,et al.  EfficientDet: Scalable and Efficient Object Detection , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[37]  Yansheng Li,et al.  Error-Tolerant Deep Learning for Remote Sensing Image Scene Classification , 2020, IEEE Transactions on Cybernetics.

[38]  Shu Liu,et al.  Path Aggregation Network for Instance Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Junjie Yan,et al.  Grid R-CNN , 2018, 1811.12030.

[41]  Shiming Xiang,et al.  Vehicle Detection in Satellite Images by Hybrid Deep Convolutional Neural Networks , 2014, IEEE Geoscience and Remote Sensing Letters.

[42]  Jiebo Luo,et al.  DOTA: A Large-Scale Dataset for Object Detection in Aerial Images , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[43]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[44]  Kavita Bala,et al.  Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Daniel P. Huttenlocher,et al.  Spatial priors for part-based recognition using statistical models , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[46]  Qixiang Ye,et al.  Orientation robust object detection in aerial images using deep convolutional neural network , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[47]  Junwei Han,et al.  Multi-class geospatial object detection and geographic image classification based on collection of part detectors , 2014 .

[48]  Xingyi Zhou,et al.  Bottom-Up Object Detection by Grouping Extreme and Center Points , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[49]  J. A. Hartigan,et al.  A k-means clustering algorithm , 1979 .

[50]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[51]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[52]  Jun Zhou,et al.  Multiscale Visual Attention Networks for Object Detection in VHR Remote Sensing Images , 2019, IEEE Geoscience and Remote Sensing Letters.

[53]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Junwei Han,et al.  A Survey on Object Detection in Optical Remote Sensing Images , 2016, ArXiv.

[55]  Naif Alajlan,et al.  Deep Learning Approach for Car Detection in UAV Imagery , 2017, Remote. Sens..

[56]  Gong Cheng,et al.  Multi-modal deep learning for landform recognition , 2019 .

[57]  Wesam A. Sakla,et al.  A Large Contextual Dataset for Classification, Detection and Counting of Cars with Deep Learning , 2016, ECCV.

[58]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.

[59]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[60]  Jiří Jaromír Klemeš,et al.  Value chain mapping of the water and sewage treatment to contribute to sustainability. , 2019, Journal of environmental management.

[61]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[62]  Fei Wang,et al.  CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Jun Zhang,et al.  Geospatial Object Detection in Remote Sensing Imagery Based on Multiscale Single-Shot Detector with Activated Semantics , 2018, Remote. Sens..