Adversarial Examples in Deep Learning: Characterization and Divergence

The burgeoning success of deep learning has raised security and privacy concerns as more and more tasks involve sensitive data. Adversarial attacks have emerged as one of the dominant security threats to a range of mission-critical deep learning systems and applications. This paper takes a holistic and principled approach to the statistical characterization of adversarial examples in deep learning. We provide a general formulation of adversarial examples and elaborate on the basic principles of adversarial attack algorithm design. We introduce an easy/hard categorization of adversarial attacks and analyze the effectiveness of adversarial examples in terms of attack success rate, degree of change in the adversarial perturbation, average entropy of the prediction qualities, and the fraction of adversarial examples that lead to successful attacks. We conduct an extensive experimental study of adversarial behavior in easy and hard attacks on deep learning models with different hyperparameters and under different deep learning frameworks. We show that the same adversarial attack behaves differently across hyperparameter settings and frameworks because the models learn different features during training. Our statistical characterization, backed by strong empirical evidence, informs mitigation strategies and effective countermeasures against present and future adversarial attacks.
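As a concrete illustration of the general formulation and the characterization metrics described above, the sketch below instantiates the formulation with the fast gradient sign method (FGSM) as one example attack and computes three of the metrics (attack success rate, perturbation magnitude, and average prediction entropy). This is a minimal sketch, assuming a PyTorch classifier, inputs scaled to [0, 1], and an illustrative perturbation budget; it is not the paper's exact experimental setup.

```python
# Minimal sketch: one instance (FGSM) of the generic adversarial-example formulation
# x' = x + delta, with delta chosen to flip the model's prediction under a norm budget.
# The model, epsilon, and metric definitions here are illustrative assumptions.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y_true, epsilon=0.03):
    """Return x' = clip(x + epsilon * sign(grad_x L(f(x), y_true)), 0, 1)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_true)
    loss.backward()
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

def attack_metrics(model, x, x_adv, y_true):
    """Batch-level attack success rate, mean L2 perturbation, and mean
    prediction entropy -- three of the characterization metrics in the text."""
    with torch.no_grad():
        probs = F.softmax(model(x_adv), dim=1)
        preds = probs.argmax(dim=1)
        success_rate = (preds != y_true).float().mean().item()
        perturbation = (x_adv - x).flatten(1).norm(dim=1).mean().item()
        entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean().item()
    return success_rate, perturbation, entropy
```

Running the same sketch against models trained with different hyperparameters or under different frameworks is one simple way to observe the divergence in attack behavior that the paper characterizes.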
