Evaluating the Robustness of Bayesian Neural Networks Against Different Types of Attacks

To evaluate the robustness gains of Bayesian neural networks on image classification tasks, we apply input perturbations and adversarial attacks to state-of-the-art Bayesian neural networks, using a benchmark CNN model as a reference. The attacks are selected to simulate signal interference and cyberattacks against CNN-based machine learning systems. The results show that a Bayesian neural network achieves significantly higher robustness to adversarial attacks generated against a deterministic neural network model, without any adversarial training. The Bayesian posterior can also act as a safety precursor, flagging ongoing malicious activity. Furthermore, we show that a stochastic classifier placed after a deterministic CNN feature extractor already provides sufficient robustness enhancement, so a stochastic feature extractor in front of the classifier is not required. This offers guidance on where to place stochastic layers when building decision-making pipelines for safety-critical domains.
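The sketch below illustrates (not the authors' exact implementation) the pipeline described above, assuming TensorFlow and TensorFlow Probability: a deterministic CNN feature extractor followed by a stochastic classifier head built from Flipout variational layers, plus a Monte Carlo estimate of the posterior predictive whose spread can serve as the kind of precursor signal discussed. The layer sizes and the 10-class output are illustrative placeholders.

```python
# Minimal sketch: deterministic CNN feature extractor + stochastic (Bayesian)
# classifier head, with Monte Carlo predictive uncertainty as an attack precursor.
# Assumes TensorFlow / TensorFlow Probability; architecture details are
# illustrative, not the paper's exact configuration.
import tensorflow as tf
import tensorflow_probability as tfp


def build_model(num_classes=10):
    # Deterministic convolutional feature extractor.
    features = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
    ])
    # Stochastic classifier head: Flipout layers draw weight perturbations
    # from the variational posterior on every forward pass.
    classifier = tf.keras.Sequential([
        tfp.layers.DenseFlipout(128, activation="relu"),
        tfp.layers.DenseFlipout(num_classes),
    ])
    return tf.keras.Sequential([features, classifier])


def predictive_distribution(model, x, num_samples=30):
    # Monte Carlo estimate of the posterior predictive distribution.
    # Large variance (or entropy) across samples can flag perturbed or
    # adversarial inputs before a decision is acted upon.
    probs = tf.stack(
        [tf.nn.softmax(model(x, training=False), axis=-1)
         for _ in range(num_samples)],
        axis=0,
    )
    return tf.reduce_mean(probs, axis=0), tf.math.reduce_std(probs, axis=0)
```

Keeping the stochastic layers only in the classifier head, as in this sketch, mirrors the placement the abstract argues is sufficient; only the final dense layers need to be sampled at inference time.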
