Improving Object Detection Using Weakly-Annotated Auxiliary Multi-Label Segmentation

With the rapid development of deep learning techniques, the performance of object detection has improved significantly. Recently, several approaches to joint learning of object detection and semantic segmentation have been proposed to exploit the complementary benefits of these two highly correlated tasks. In this work, we propose a weakly-annotated auxiliary multi-label segmentation network that boosts object detection performance without additional computational cost at inference. The proposed auxiliary segmentation network is trained on a weakly-annotated dataset and therefore does not require expensive pixel-level annotations. Unlike previous approaches, we use multi-label segmentation to jointly supervise auxiliary segmentation and object detection, which improves occlusion handling. The proposed method can be integrated with any one-stage object detector, such as RetinaNet, YOLOv3, YOLOv4, or SSD. Our experimental results on the MS COCO dataset show that the proposed method improves the performance of popular one-stage object detectors without slowing down inference, even under sub-optimal training sample selection schemes.
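To make the idea of a weakly-annotated multi-label segmentation target concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the weak annotations are the bounding boxes already available for detection, that each class gets its own binary channel so overlapping (occluded) objects of different classes can both be active at one pixel, and that the auxiliary head is trained with a per-pixel sigmoid cross-entropy. The function names `boxes_to_multilabel_mask` and `multilabel_bce` are hypothetical.

```python
import numpy as np


def boxes_to_multilabel_mask(boxes, labels, num_classes, height, width):
    """Build a (num_classes, height, width) binary target from boxes.

    Assumption: boxes are (x1, y1, x2, y2) in pixels with x2/y2 exclusive,
    and labels are integer class ids in [0, num_classes).
    """
    mask = np.zeros((num_classes, height, width), dtype=np.float32)
    for (x1, y1, x2, y2), cls in zip(boxes, labels):
        # Clip each box to the image so slicing stays in bounds.
        x1, y1 = max(0, int(x1)), max(0, int(y1))
        x2, y2 = min(width, int(x2)), min(height, int(y2))
        if x2 > x1 and y2 > y1:
            # One channel per class: overlapping boxes of different
            # classes do not overwrite each other.
            mask[cls, y1:y2, x1:x2] = 1.0
    return mask


def multilabel_bce(logits, target, eps=1e-7):
    """Mean per-pixel, per-class sigmoid cross-entropy (illustrative loss)."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    probs = np.clip(probs, eps, 1.0 - eps)
    loss = -(target * np.log(probs) + (1.0 - target) * np.log(1.0 - probs))
    return float(loss.mean())


# Two overlapping boxes of different classes on an 8x8 image.
target = boxes_to_multilabel_mask(
    boxes=[(0, 0, 4, 4), (2, 2, 6, 6)],
    labels=[0, 1],
    num_classes=3,
    height=8,
    width=8,
)
loss = multilabel_bce(np.zeros_like(target), target)
```

In the overlap region both class channels are set, which a single-label (softmax) target could not represent. Because such a head would only be needed to compute the training loss, it could be dropped at inference, which is consistent with the claim of no additional inference cost.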