Robust Pre-Processing: A Robust Defense Method Against Adversarial Attacks

Deep learning algorithms and networks are vulnerable to perturbed inputs known as adversarial examples, and many defense methodologies have been investigated to counter such adversarial attacks. In this work, we propose a novel methodology to defend against existing powerful attack models. Such attack models have achieved record success against the MNIST dataset, forcing the classifier to misclassify all of its inputs. Our proposed defense, robust pre-processing, achieves the best accuracy among the current state-of-the-art defenses. It applies a Tanh (hyperbolic tangent) function, smoothing, and batch normalization to the input data, making the network more robust against adversarial attack. Robust pre-processing improves the white-box attack accuracy on MNIST from 94.3% to 98.7%. Even under increasingly strong attacks, when other defenses fail completely, robust pre-processing remains one of the strongest defenses ever reported. Another strength of our defense is that it eliminates the need for adversarial training, since it significantly increases MNIST accuracy even without it. This makes it a more generalized defense method with almost half the training overhead and much-improved accuracy. Robust pre-processing also increases inference accuracy under powerful attacks on the CIFAR-10 and SVHN datasets without sacrificing much clean-data accuracy.
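To make the three named components concrete, below is a minimal sketch of a tanh, smoothing, and batch-normalization input pipeline in PyTorch. The ordering, the use of stride-1 average pooling as the smoothing step, the kernel size, and the RobustPreProcessing class name are all assumptions for illustration; the abstract only names the three components, not their exact configuration.

# Minimal sketch (assumed configuration) of a tanh -> smoothing -> batch-norm
# input pipeline, applied before the classifier sees the data.
import torch
import torch.nn as nn
import torch.nn.functional as F


class RobustPreProcessing(nn.Module):
    """Hypothetical pre-processing module: tanh, smoothing, batch norm."""

    def __init__(self, channels: int = 1, smoothing_kernel: int = 3):
        super().__init__()
        self.smoothing_kernel = smoothing_kernel
        # Batch normalization over the pre-processed input channels.
        self.bn = nn.BatchNorm2d(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # 1. Hyperbolic tangent squashes pixel values into (-1, 1),
        #    bounding the effect of small adversarial perturbations.
        x = torch.tanh(x)
        # 2. Local smoothing (here: average pooling with stride 1, an assumed
        #    choice) suppresses high-frequency perturbation noise.
        pad = self.smoothing_kernel // 2
        x = F.avg_pool2d(x, self.smoothing_kernel, stride=1, padding=pad)
        # 3. Batch normalization re-centres and re-scales the smoothed input.
        return self.bn(x)


if __name__ == "__main__":
    # Usage example: pre-process a batch of MNIST-sized images.
    images = torch.rand(8, 1, 28, 28)
    preproc = RobustPreProcessing(channels=1)
    clean_inputs = preproc(images)
    print(clean_inputs.shape)  # torch.Size([8, 1, 28, 28])

In a full pipeline, the output of such a module would simply be fed to the classifier in place of the raw images, during both training and inference.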
