Random Feature Nullification for Adversary Resistant Deep Architecture

Deep neural networks (DNNs) have proven highly effective in many applications, such as image recognition and the automated analysis of security or traffic camera footage (e.g., measuring traffic flow or spotting suspicious activity). Despite this superior performance, DNNs have recently been shown to be susceptible to a particular type of attack that exploits a fundamental flaw in their design. Specifically, an attacker can craft a synthetic example, referred to as an adversarial sample, that causes the DNN to produce an output behavior of the attacker's choosing, such as misclassification. Addressing this flaw is essential if DNNs are to be deployed in critical applications such as cybersecurity. Previous work has provided various defense mechanisms, either by increasing model nonlinearity or by enhancing model complexity. However, after a thorough analysis of the fundamental flaw in DNNs, we find that the effectiveness of such methods is limited. We therefore propose a new adversary-resistant technique that obstructs attackers from constructing impactful adversarial samples by randomly nullifying features within the input samples. Using the MNIST dataset, we evaluate the proposed technique and empirically show that it significantly boosts DNN robustness against adversarial samples while maintaining high classification accuracy.
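The abstract only names the core idea, randomly nullifying input features, so the following is a minimal sketch of how such a preprocessing step could look, not the authors' actual implementation. The function name and the nullification rate p_nullify are hypothetical choices made here for illustration; the sketch simply multiplies each sample by a fresh random binary mask so an attacker cannot predict which features will survive.

```python
import numpy as np

def random_feature_nullification(x, p_nullify=0.3, rng=None):
    """Randomly zero out (nullify) a fraction of input features.

    A sketch of the idea described in the abstract: each sample is
    element-wise multiplied by a random binary mask drawn at prediction
    time. `p_nullify` is an assumed hyperparameter, not a value from
    the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    mask = rng.random(x.shape) >= p_nullify  # True keeps a feature, False nullifies it
    return x * mask

# Example: nullify roughly 30% of the pixels in a batch of flattened
# MNIST-like images before passing them to a classifier.
batch = np.random.rand(64, 784)  # stand-in for real MNIST data
defended_batch = random_feature_nullification(batch, p_nullify=0.3)
```

Because the mask is redrawn for every sample, a gradient-based attacker cannot know in advance which coordinates of a crafted perturbation will actually reach the network, which is the intuition behind the robustness claim.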
