Are Generative Classifiers More Robust to Adversarial Attacks?

There is rising interest in studying the robustness of deep neural network classifiers against adversaries, with both advanced attack and defence techniques being actively developed. However, most recent work focuses on discriminative classifiers, which model only the conditional distribution of the labels given the inputs. In this abstract we propose the deep Bayes classifier, which improves classical naive Bayes with conditional deep generative models, and verify its robustness against a number of existing attacks. We further develop a detection method for adversarial examples based on conditional deep generative models. Our initial results on MNIST suggest that deep Bayes classifiers might be more robust than deep discriminative classifiers, and that the proposed detection method achieves high detection rates against two commonly used attacks.
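
The sketch below illustrates the generative-classifier idea described above: classify with Bayes' rule, argmax_y log p(x|y) + log p(y), where log p(x|y) comes from a class-conditional generative model, and reject inputs whose best log joint falls below a threshold as suspected adversarial examples. It is a minimal illustration, not the paper's exact method: the estimator `log_px_given_y`, the toy Gaussian model standing in for a conditional deep generative model, and the threshold-based rejection rule are all assumptions made for the example.

```python
# Minimal sketch of a "deep Bayes" (generative) classifier with likelihood-based
# rejection of suspected adversarial inputs.  `log_px_given_y` is a placeholder
# for a conditional deep generative model (e.g. a conditional VAE whose ELBO
# approximates log p(x|y)); the names and the toy model below are illustrative
# assumptions, not the paper's implementation.
import numpy as np

def deep_bayes_predict(x, log_px_given_y, log_prior, threshold=None):
    """Classify x with Bayes' rule; optionally reject low-likelihood inputs.

    log_px_given_y: callable (x, y) -> estimate of log p(x | y)
    log_prior:      array of log p(y) for each class
    threshold:      if given, inputs whose best log joint is below it are
                    flagged as suspected adversarial examples (label -1).
    """
    num_classes = len(log_prior)
    # log p(x, y) = log p(x | y) + log p(y) for every candidate class
    log_joint = np.array([log_px_given_y(x, y) + log_prior[y]
                          for y in range(num_classes)])
    y_hat = int(np.argmax(log_joint))
    if threshold is not None and log_joint[y_hat] < threshold:
        return -1  # x looks unlikely under every class-conditional model
    return y_hat

# Toy usage: an isotropic Gaussian per class stands in for the deep generative
# component (classical naive Bayes corresponds to this kind of special case).
rng = np.random.default_rng(0)
means = rng.normal(size=(10, 784))  # one illustrative mean per MNIST class

def toy_log_px_given_y(x, y):
    # Gaussian log-density up to an additive constant
    return -0.5 * np.sum((x - means[y]) ** 2)

x = means[3] + 0.1 * rng.normal(size=784)
print(deep_bayes_predict(x, toy_log_px_given_y, np.full(10, -np.log(10))))  # -> 3
```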
