Localizing Firearm Carriers By Identifying Human-Object Pairs

Visual identification of gunmen in a crowd is a challenging problem, that requires resolving the association of a person with an object (firearm). We present a novel approach to address this problem, by defining human-object interaction (and non-interaction) bounding boxes. In a given image, human and firearms are separately detected. Each detected human is paired with each detected firearm, allowing us to create a paired bounding box that contains both object and the human. A network is trained to classify these paired-bounding-boxes into human carrying the identified firearm or not. Extensive experiments were performed to evaluate the effectiveness of the algorithm, including exploiting full pose of the human, hand-keypoints, and their association with the firearm. The knowledge of spatially localized features is key to the success of our method by using multi-size proposals with adaptive average pooling. We have also extended a previously existing firearm detection dataset, by adding more images and tagging in the extended dataset the human-firearm pairs (including bounding boxes for firearms and gunmen). The experimental results $({78.5 AP}_{hold})$ demonstrate effectiveness of the proposed method.

[1]  Ali Farhadi,et al.  YOLOv3: An Incremental Improvement , 2018, ArXiv.

[2]  Ali Farhadi,et al.  You Only Look Once: Unified, Real-Time Object Detection , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Pietro Perona,et al.  Microsoft COCO: Common Objects in Context , 2014, ECCV.

[4]  Wei Liu,et al.  DSSD : Deconvolutional Single Shot Detector , 2017, ArXiv.

[5]  Yaser Sheikh,et al.  OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[7]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[8]  Toby P. Breckon,et al.  Using Deep Convolutional Neural Network Architectures for Object Classification and Detection Within X-Ray Baggage Security Imagery , 2018, IEEE Transactions on Information Forensics and Security.

[9]  Francisco Herrera,et al.  Automatic handgun detection alarm in videos using deep learning , 2017, Neurocomputing.

[10]  Kaiming He,et al.  Detecting and Recognizing Human-Object Interactions , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Larry S. Davis,et al.  An Analysis of Scale Invariance in Object Detection - SNIP , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[12]  Hao Chen,et al.  FCOS: Fully Convolutional One-Stage Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[13]  Sergio Guadarrama,et al.  Speed/Accuracy Trade-Offs for Modern Convolutional Object Detectors , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Kilian Q. Weinberger,et al.  Densely Connected Convolutional Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Yaser Sheikh,et al.  Hand Keypoint Detection in Single Images Using Multiview Bootstrapping , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Ali Farhadi,et al.  YOLO9000: Better, Faster, Stronger , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Jorma Laaksonen,et al.  Deep Contextual Attention for Human-Object Interaction Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[18]  Arif Mahmood,et al.  Orientation Aware Object Detection with Application to Firearms , 2019, ArXiv.

[19]  Varun Ramakrishna,et al.  Convolutional Pose Machines , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[21]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Kaiming He,et al.  Feature Pyramid Networks for Object Detection , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Qi Tian,et al.  CenterNet: Keypoint Triplets for Object Detection , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[24]  Luc Van Gool,et al.  The Pascal Visual Object Classes (VOC) Challenge , 2010, International Journal of Computer Vision.