Improving Object Detection with Relation Graph Inference

Many classic object detection approaches have shown that adding an object’s context information improves detection performance. However, only a few methods have attempted to exploit object-to-object relationships during learning. The reason is that objects may appear at different locations in an image, with arbitrary sizes and scales, which makes it difficult to model them in a unified way within a network. Inspired by the Graph Convolutional Network (GCN), we propose a detection algorithm that infers the relationships among multiple objects during inference by dynamically constructing a relation graph with a self-adopted attention mechanism. The relation graph encodes both the geometric and the visual relationships between objects, and it enriches each object’s feature by aggregating information from the object and its relevant neighbors. Experiments show that the proposed module can efficiently improve the detection performance of existing object detectors.
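
The idea of a dynamically built relation graph can be illustrated with a minimal sketch. It is not the formulation used in this work, which the abstract does not specify. It assumes boxes given as (cx, cy, w, h), a scaled dot product as the visual affinity, a hand-picked geometric affinity based on center distance and area ratio, a row-wise softmax as the attention step, and a residual sum as the aggregation. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def geometric_score(box_a, box_b):
    """Assumed geometric affinity: nearby, similarly sized boxes score higher.

    Boxes are (cx, cy, w, h). This particular formula is an illustrative choice.
    """
    xa, ya, wa, ha = box_a
    xb, yb, wb, hb = box_b
    # Center distance, normalized by the scale of the reference box.
    dist = math.hypot(xa - xb, ya - yb) / math.sqrt(wa * ha)
    # Absolute log ratio of the two box areas (0 when the sizes match).
    size_gap = abs(math.log((wb * hb) / (wa * ha)))
    return -(dist + size_gap)


def visual_score(feat_a, feat_b):
    """Assumed visual affinity: scaled dot product of two feature vectors."""
    dot = sum(p * q for p, q in zip(feat_a, feat_b))
    return dot / math.sqrt(len(feat_a))


def relation_graph(boxes, feats):
    """Build a row-normalized adjacency matrix from visual and geometric cues.

    Row i holds the attention weights of object i over all objects,
    including itself (a self-loop, as is common in GCN-style models).
    """
    n = len(boxes)
    adjacency = []
    for i in range(n):
        scores = [
            visual_score(feats[i], feats[j]) + geometric_score(boxes[i], boxes[j])
            for j in range(n)
        ]
        adjacency.append(softmax(scores))
    return adjacency


def aggregate(feats, adjacency):
    """Enrich each feature with the attention-weighted sum over all objects."""
    dim = len(feats[0])
    enriched = []
    for i, row in enumerate(adjacency):
        message = [sum(row[j] * feats[j][d] for j in range(len(feats)))
                   for d in range(dim)]
        # Residual connection: keep the original feature and add the message.
        enriched.append([f + m for f, m in zip(feats[i], message)])
    return enriched
```

Because the graph is recomputed from the boxes and features of each image, the number of nodes and the edge weights change with the detections. This is one way to handle objects that appear at arbitrary positions and scales. A real detector would apply learned projections before scoring and aggregating; those are omitted here.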
