Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks

Recent studies have shown that DNNs can be compromised by backdoor attacks crafted at training time. A backdoor attack installs a backdoor into the victim model by injecting a backdoor pattern into a small proportion of the training data. At test time, the victim model behaves normally on clean test data, yet consistently predicts a specific (likely incorrect) target class whenever the backdoor pattern is present in a test example. While existing backdoor attacks are effective, they are not stealthy: the modifications made to the training data or labels are often suspicious and can be easily detected by simple data filtering or human inspection. In this paper, we present a new type of backdoor attack inspired by an important natural phenomenon: reflection. Using the mathematical modeling of physical reflection, we propose reflection backdoor (Refool), which plants reflections as backdoors into a victim model. We demonstrate on 3 computer vision tasks and 5 datasets that Refool can attack state-of-the-art DNNs with a high success rate and is resistant to state-of-the-art backdoor defenses.
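
For concreteness, the sketch below shows one way a reflection-poisoned training image could be composed, assuming a simple additive reflection model in which a blurred reflection layer is superimposed on the clean (transmission) image. The function name add_reflection, the blending weight alpha, the blur_sigma parameter, and the use of a SciPy Gaussian blur are illustrative assumptions for this sketch, not the paper's exact reflection formulation.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def add_reflection(clean_img: np.ndarray,
                       reflection_img: np.ndarray,
                       blur_sigma: float = 2.0,
                       alpha: float = 0.4) -> np.ndarray:
        """Superimpose a blurred reflection layer onto a clean image.

        Both inputs are float arrays in [0, 1] with identical shape (H, W, 3).
        The reflection layer is Gaussian-blurred to mimic an out-of-focus
        reflection, then added with weight alpha and clipped back to [0, 1].
        """
        # Blur only the spatial axes so the reflection appears out of focus.
        blurred = gaussian_filter(reflection_img,
                                  sigma=(blur_sigma, blur_sigma, 0))
        # Additive composition of the clean (transmission) layer and the reflection.
        poisoned = np.clip(clean_img + alpha * blurred, 0.0, 1.0)
        return poisoned

In the attack described above, only a small proportion of the training images would be blended this way, so the poisoned examples still look like natural photographs taken through glass while the model learns to associate the reflection pattern with the target class.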
