Securing Deep Neural Nets against Adversarial Attacks with Moving Target Defense

Deep Neural Networks (DNNs) are presently the state of the art for image classification tasks. However, recent work has shown that these systems can easily be fooled into misidentifying images by modifying the input in particular ways, often rendering them practically useless. Moreover, the defense mechanisms proposed in the literature so far are mostly attack-specific and prove ineffective against new attacks. Indeed, recent work on universal perturbations can generate a single modification, applicable to all test images, that makes existing networks misclassify 90% of the time, and to our knowledge no existing defense mechanism prevents this. The design of a general defense strategy against a wide range of attacks on Neural Networks is therefore a challenging problem. In this paper, we draw inspiration from recent advances in cybersecurity and multi-agent systems and propose to use the concept of Moving Target Defense (MTD) to increase the robustness of well-known deep networks trained on the ImageNet dataset against such adversarial attacks. In doing so, we formalize and exploit the notion of differential immunity of different networks to specific attacks. To classify each test image, we pick one of the trained networks and use its classification output. To ensure maximum robustness, we generate an effective switching strategy by formulating this interaction as a Repeated Bayesian Stackelberg Game (BSG) between a Defender (who hosts the classification networks) and Users (both legitimate users and attackers). As the network-switching strategy, we compute a Strong Stackelberg Equilibrium that optimizes prediction accuracy for legitimate users while reducing the misclassification rate on adversarially modified test images. We show that our approach yields an accuracy of 92.79% for legitimate users, while attackers can induce misclassification only 58% of the time (instead of 93.7%), even when they select the best attack available to them. This is at least twice as good, and sometimes an order of magnitude better, than the accuracy rates of the worst-affected networks.
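To make the switching idea above concrete, here is a minimal, hypothetical sketch in Python. Given a made-up accuracy matrix over networks and attack types, together with a defender mixed strategy (which in the paper would come from the Strong Stackelberg Equilibrium of the BSG, not the hard-coded vector used here), it samples one network per query and computes the expected accuracy for legitimate users and against the attacker's best response. All network names, attack names, and numbers below are illustrative placeholders, not results or code from the paper.

```python
# Illustrative sketch of MTD-style network switching (assumed, not the authors' code).
import random

# acc[n][a] = probability that network n classifies correctly when the input
# comes from user type a ("legit" = unmodified images, others = specific attacks).
# These numbers are invented solely to illustrate differential immunity.
NETWORKS = ["vgg16", "resnet", "googlenet"]            # assumed pool of trained DNNs
ATTACKS = ["legit", "fgsm", "deepfool", "universal"]   # assumed user/attack types
acc = {
    "vgg16":     {"legit": 0.92, "fgsm": 0.10, "deepfool": 0.55, "universal": 0.40},
    "resnet":    {"legit": 0.94, "fgsm": 0.60, "deepfool": 0.08, "universal": 0.50},
    "googlenet": {"legit": 0.93, "fgsm": 0.50, "deepfool": 0.45, "universal": 0.07},
}

# Defender's mixed strategy over networks; in the paper this would be the
# equilibrium strategy of the Bayesian Stackelberg Game, here a placeholder.
strategy = {"vgg16": 0.3, "resnet": 0.4, "googlenet": 0.3}

def pick_network():
    """Sample one network per test image according to the switching strategy."""
    nets, probs = zip(*strategy.items())
    return random.choices(nets, weights=probs, k=1)[0]

def expected_accuracy(attack):
    """Expected correct-classification rate of the switched system for one user type."""
    return sum(strategy[n] * acc[n][attack] for n in NETWORKS)

# A rational attacker best-responds with the attack that minimizes expected accuracy.
best_attack = min(ATTACKS[1:], key=expected_accuracy)
print("network chosen for this query:", pick_network())
print("legitimate-user accuracy:", expected_accuracy("legit"))
print("accuracy under attacker's best response (%s): %.2f"
      % (best_attack, expected_accuracy(best_attack)))
```

Because no single network is vulnerable to every attack, randomizing over the pool keeps the attacker's best response well below the misclassification rate it would achieve against any one fixed, worst-affected network, at only a small cost in accuracy for legitimate users.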
