FaceGuard: A Self-Supervised Defense Against Adversarial Face Images

Prevailing defense mechanisms against adversarial face images tend to overfit to the adversarial perturbations seen during training and fail to generalize to unseen adversarial attacks. We propose FaceGuard, a new self-supervised adversarial defense framework that can automatically detect, localize, and purify a wide variety of adversarial faces without requiring pre-computed adversarial training samples. During training, FaceGuard automatically synthesizes challenging and diverse adversarial attacks; a classifier learns to distinguish them from real faces, while a purifier attempts to remove the adversarial perturbations in image space. Experimental results on the LFW dataset show that FaceGuard achieves 99.81% detection accuracy on six unseen adversarial attack types. In addition, the proposed method improves the face recognition performance of ArcFace from 34.27% TAR @ 0.1% FAR under no defense to 77.46% TAR @ 0.1% FAR.
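
To make the described training procedure concrete, below is a minimal PyTorch sketch of the kind of three-player loop the abstract outlines: a generator synthesizes adversarial perturbations on the fly, a detector learns to separate real from adversarial faces, and a purifier learns to undo the perturbations in image space. All architectures, optimizers, losses, and the perturbation budget `eps` here are illustrative assumptions rather than the paper's actual configuration; in particular, the paper's generator is additionally trained to fool a face matcher (e.g., ArcFace) and to produce diverse perturbations, which this sketch omits.

```python
# Hedged sketch of a self-supervised detect-and-purify training loop.
# Network sizes, loss weights, and eps are assumptions, not the paper's values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyImage2Image(nn.Module):
    """Stand-in image-to-image network; FaceGuard's real networks are larger."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())
    def forward(self, x):
        return self.body(x)

G, P = TinyImage2Image(), TinyImage2Image()   # generator and purifier
D = nn.Sequential(                            # real-vs-adversarial classifier
    nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-4)
opt_p = torch.optim.Adam(P.parameters(), lr=1e-4)
eps = 8.0 / 255.0                             # assumed L-inf perturbation budget

def train_step(real):
    # 1) Synthesize adversarial faces on the fly; no pre-computed attacks.
    adv = torch.clamp(real + eps * G(real), 0.0, 1.0)

    # 2) Detector: real faces -> 1, synthesized adversarial faces -> 0.
    opt_d.zero_grad()
    logit_r, logit_a = D(real), D(adv.detach())
    loss_d = (F.binary_cross_entropy_with_logits(logit_r, torch.ones_like(logit_r))
              + F.binary_cross_entropy_with_logits(logit_a, torch.zeros_like(logit_a)))
    loss_d.backward()
    opt_d.step()

    # 3) Generator: fool the detector, keeping the attacks challenging.
    opt_g.zero_grad()
    logit_fool = D(adv)
    loss_g = F.binary_cross_entropy_with_logits(logit_fool, torch.ones_like(logit_fool))
    loss_g.backward()   # grads also reach D but are cleared before its next update
    opt_g.step()

    # 4) Purifier: remove the perturbation in image space (L1 to the real face).
    opt_p.zero_grad()
    purified = torch.clamp(adv.detach() + P(adv.detach()), 0.0, 1.0)
    loss_p = F.l1_loss(purified, real)
    loss_p.backward()
    opt_p.step()
    return loss_d.item(), loss_g.item(), loss_p.item()

# Usage: one step on a random batch standing in for aligned 112x112 face crops.
print(train_step(torch.rand(4, 3, 112, 112)))
```

Since the detector never sees pre-computed attacks, its training distribution is limited only by what the adversarial generator can synthesize, which is the intuition behind the reported generalization to unseen attack types.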
