Static deep neural network analysis for robustness

This work studies the static structure of deep neural network models using a white-box approach and uses that knowledge to identify susceptible classes, i.e., classes that can be easily misclassified. Given the susceptible classes, we propose retraining the model on those classes to increase robustness. Our preliminary results have been evaluated on the MNIST, F-MNIST, and CIFAR-10 datasets (with ImageNet and ResNet-32 models) and compared against two state-of-the-art detectors.
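To make the idea concrete, the sketch below illustrates one possible static, white-box signal for susceptible classes; this is an illustrative assumption, not the paper's actual algorithm. It treats classes whose final-layer weight vectors are highly aligned as likely to be confused, and ranks class pairs by that alignment. The function name `susceptible_class_pairs` and the placeholder weight matrix `W` are hypothetical.

```python
# Minimal sketch (assumption, not the paper's method): use cosine similarity
# between the final fully connected layer's per-class weight vectors as a
# static indicator of which class pairs are easy to confuse. `W` is a random
# stand-in for a trained model's last-layer weights (num_classes x feature_dim).
import numpy as np

def susceptible_class_pairs(W, top_k=3):
    """Rank class pairs by cosine similarity of their final-layer weight vectors."""
    W_norm = W / np.linalg.norm(W, axis=1, keepdims=True)
    sim = W_norm @ W_norm.T  # pairwise cosine similarities between class weights
    pairs = [(i, j, sim[i, j])
             for i in range(len(W)) for j in range(i + 1, len(W))]
    return sorted(pairs, key=lambda p: -p[2])[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(10, 64))  # placeholder: 10 classes, 64-dim features
    for i, j, s in susceptible_class_pairs(W):
        print(f"classes ({i}, {j}) look susceptible: cosine similarity {s:.3f}")
```

In a retraining step along the lines the abstract describes, the flagged classes could then be emphasized, for example by oversampling them or adding adversarially perturbed examples for those classes to the training set.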
