New Threats Against Object Detector with Non-local Block

Introducing non-local blocks into the traditional CNN architecture improves performance on various computer vision tasks by strengthening the network's ability to capture long-range dependencies. However, non-local blocks may also introduce new threats to computer vision systems, so it is important to study these threats before deploying non-local blocks in commercial systems. This paper investigates two new threats against object detectors with a non-local block: the disappearing attack and the appearing attack. The former misleads the detector so that it fails to detect a target object category, while the latter misleads the detector so that it detects a predefined object category that is not present in the image. Unlike existing attacks against object detectors, these attacks can be performed at long range: the target object and the universal adversarial patch learned by the proposed algorithms can be far apart. To examine the threats, digital and physical experiments are conducted on Faster R-CNN with a non-local block, using 6331 images from 56 videos. The experiments show that the universal patches are able to mislead the detector with greater probabilities. To explain the threats from non-local blocks, the receptive fields of CNN models with and without non-local blocks are studied empirically and theoretically.
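The long-range effect described above can be illustrated with a toy sketch. It is not the paper's attack algorithm or its Faster R-CNN setup. It assumes a 1-D feature sequence with scalar features, a simplified non-local operation with identity embeddings and a residual connection, and a hypothetical helper `influence` that measures how much one output position changes when a distant input position is perturbed.

```python
import math


def local_conv(x, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """1-D convolution with zero padding: each output sees only its neighbours."""
    r = len(kernel) // 2
    n = len(x)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - r
            if 0 <= j < n:
                acc += w * x[j]
        out.append(acc)
    return out


def non_local(x):
    """Simplified non-local operation (assumption: identity embeddings).

    z_i = x_i + sum_j softmax_j(x_i * x_j) * x_j, so every output position
    aggregates information from every input position.
    """
    out = []
    for xi in x:
        scores = [xi * xj for xj in x]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        y = sum(w * xj for w, xj in zip(weights, x)) / total
        out.append(xi + y)
    return out


def influence(op, x, target, source, eps=1.0):
    """Absolute change at `target` when position `source` is perturbed by eps."""
    perturbed = list(x)
    perturbed[source] += eps
    return abs(op(perturbed)[target] - op(x)[target])


# Illustrative feature values; position 7 plays the role of a distant "patch".
features = [0.5, -0.2, 0.1, 0.3, -0.4, 0.2, 0.0, 0.6]
far_local = influence(local_conv, features, target=0, source=7)
far_non_local = influence(non_local, features, target=0, source=7)
print("local conv:", far_local)
print("non-local :", far_non_local)
```

In this sketch the perturbation at position 7 leaves the single local convolution's output at position 0 unchanged, because position 7 lies outside its 3-tap window, while the non-local output at position 0 changes. This is the kind of receptive-field difference the abstract refers to; a deep stack of convolutions would enlarge the local receptive field, so the comparison here covers one layer only.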
