Shape-Adaptive Kernel Network for Dense Object Detection

Dense object detectors, which make predictions over a regular, dense grid, have advanced rapidly and drawn considerable attention recently. Their fully convolutional design makes them far more computationally efficient than two-stage detectors. However, their ability to adapt to shape variation on a regular grid remains limited. In this paper, we introduce a new framework, the shape-adaptive kernel network, which handles spatial manipulation of the input data in convolutional kernel space. At the heart of our approach is aligning the original kernel space to recover the shape variation of each input feature on the regular grid. To this end, we propose a shape-adaptive kernel sampler that adjusts a dynamic convolutional kernel conditioned on the input. To increase the flexibility of the geometric transformation, we design a cascade refinement module that first estimates a global transformation grid and then estimates local offsets in convolutional kernel space. Our experiments demonstrate the effectiveness of the shape-adaptive kernel network for dense object detection on various benchmarks.
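
To make the idea of transforming a kernel rather than the feature map concrete, the following is a minimal sketch and not the authors' implementation. It assumes the global stage can be modeled as a 2x2 affine matrix applied about the kernel center and that the local stage adds one (dy, dx) offset per kernel tap; the function names `bilinear_sample` and `adapt_kernel` are hypothetical. In a real detector both the affine parameters and the offsets would presumably be predicted from the input features at each location, whereas here they are fixed by hand.

```python
import math


def bilinear_sample(kernel, y, x):
    """Bilinearly sample a 2D kernel (list of lists) at fractional (y, x).

    Positions outside the kernel contribute zero.
    """
    h, w = len(kernel), len(kernel[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * kernel[yy][xx]
    return value


def adapt_kernel(kernel, affine, offsets=None):
    """Resample a kernel in two stages (illustrative assumption, see lead-in).

    Stage 1: a global 2x2 affine transform of the sampling grid about the
             kernel center.
    Stage 2: an optional per-tap local offset (dy, dx) added to that grid.
    """
    h, w = len(kernel), len(kernel[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    (a, b), (c, d) = affine
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            u, v = i - cy, j - cx          # coordinates relative to the center
            y = a * u + b * v + cy         # global transformation grid
            x = c * u + d * v + cx
            if offsets is not None:        # local refinement of each tap
                dy, dx = offsets[i][j]
                y += dy
                x += dx
            row.append(bilinear_sample(kernel, y, x))
        out.append(row)
    return out


base = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
identity = ((1, 0), (0, 1))
quarter_turn = ((0, -1), (1, 0))

same = adapt_kernel(base, identity)
rotated = adapt_kernel(base, quarter_turn)
```

With the identity transform the kernel is returned unchanged, while the quarter-turn matrix yields a rotated copy of the same weights. The adapted kernel can then be applied with an ordinary convolution, so the feature map itself stays on the regular grid.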
