DG-FPN: Learning Dynamic Feature Fusion Based on Graph Convolution Network For Object Detection

Feature Pyramid Network (FPN) is one of the most popular feature fusion methods for addressing the multi-scale problem in object detection. Current FPN-based methods are mostly designed by hand, which cannot guarantee optimal feature fusion. Moreover, such predetermined methods generally apply the same fusion strategy to every target and therefore do not distinguish among targets of different scales. In this paper, we present a novel dynamic feature fusion method based on the graph convolution network (GCN), called DG-FPN. The proposed GCN-based method dynamically transfers knowledge across all nodes with learnable weights, making it possible to learn the optimal feature fusion for detectors. Furthermore, a pixel-based adjacency matrix is proposed to offer a customized fusion strategy for each target, achieving dynamic feature fusion. To optimize the matrix-driven learning, semantic information is introduced to guide the fusion process. Experiments show that DG-FPN significantly improves the performance of baseline networks on the challenging MS-COCO object detection benchmark, especially for small objects.
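The idea of fusing pyramid levels through a pixel-based adjacency matrix can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes that each pyramid level is one graph node, that all levels have already been resized to a common resolution, and that the adjacency is obtained by a softmax over hypothetical per-pixel logits. The function name `pixelwise_graph_fusion` and all shapes are illustrative assumptions.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pixelwise_graph_fusion(features, adj_logits, weight):
    """One graph-convolution step over pyramid levels (illustrative only).

    features:   (N, C, H, W)  N pyramid levels (nodes), resized to H x W
    adj_logits: (N, N, H, W)  hypothetical per-pixel edge scores (assumed learnable)
    weight:     (C, C_out)    channel transform shared by all nodes
    """
    # Normalize over source nodes j so each pixel of node i mixes
    # the N levels with weights that sum to one.
    adjacency = softmax(adj_logits, axis=1)
    # aggregated[i, c, h, w] = sum_j adjacency[i, j, h, w] * features[j, c, h, w]
    aggregated = np.einsum("ijhw,jchw->ichw", adjacency, features)
    # Apply the shared linear transform to the channel dimension.
    fused = np.einsum("ichw,cd->idhw", aggregated, weight)
    # ReLU non-linearity.
    return np.maximum(fused, 0.0)


# Toy example: 3 pyramid levels, 4 channels, 8 x 8 feature maps.
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 4, 8, 8))
adj_logits = rng.standard_normal((3, 3, 8, 8))
weight = rng.standard_normal((4, 4))

fused = pixelwise_graph_fusion(features, adj_logits, weight)
print(fused.shape)
```

Because the adjacency has its own entry at every spatial location, two pixels on the same level can draw on the other levels in different proportions, which is one way to read the per-target fusion strategy described above. In a trained model the logits and the weight would be learned; here they are random placeholders.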
