An Empirical Study of Derivative-Free-Optimization Algorithms for Targeted Black-Box Attacks in Deep Neural Networks

We perform a comprehensive study of the performance of derivative-free optimization (DFO) algorithms for generating targeted black-box adversarial attacks on Deep Neural Network (DNN) classifiers, assuming the perturbation energy is bounded by an $\ell^\infty$ constraint and the number of queries to the network is limited. This paper considers four pre-existing state-of-the-art DFO-based algorithms, along with a new algorithm built on BOBYQA, a model-based DFO method. We compare these algorithms in a variety of settings according to the fraction of images they successfully misclassify given a maximum number of queries to the DNN. The experiments show how the likelihood of finding an adversarial example depends on both the algorithm used and the setting of the attack: algorithms that restrict the search for adversarial examples to the vertices of the $\ell^\infty$ ball perform particularly well against networks without structural defenses, while the proposed BOBYQA-based algorithm performs better for especially small perturbation energies. This variation in performance highlights the importance of comparing new algorithms against the state-of-the-art in a variety of settings, and of testing the effectiveness of adversarial defenses with as wide a range of algorithms as possible.
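To make the vertex-restricted attack family concrete, the following is a minimal sketch, not any of the algorithms evaluated in the paper: a greedy random search that keeps every coordinate of the perturbation at $\pm\epsilon$, in the spirit of combinatorial and sign-based attacks. Here `model` is an assumed black-box function mapping an image (with pixel values in $[0,1]$) to a vector of logits, with each call counting as one query; `targeted_margin_loss` is a standard attack loss, not taken from the paper.

```python
import numpy as np

def targeted_margin_loss(logits, target):
    """Attack loss to minimize: gap between the best non-target logit and
    the target logit. A negative value means the image is classified as
    the target class, i.e. the targeted attack has succeeded."""
    other = np.delete(logits, target)
    return float(np.max(other) - logits[target])

def vertex_random_search(model, x, target, eps, max_queries, block=32, rng=None):
    """Greedy random search over the vertices of the l^infinity ball:
    every coordinate of the perturbation stays at +/- eps, and candidate
    moves flip the signs of a random block of coordinates."""
    rng = rng or np.random.default_rng(0)
    delta = eps * rng.choice([-1.0, 1.0], size=x.shape)   # start at a random vertex
    best = targeted_margin_loss(model(np.clip(x + delta, 0.0, 1.0)), target)
    for _ in range(max_queries - 1):                      # one query already spent
        if best < 0:                                      # attack succeeded
            break
        idx = rng.choice(x.size, size=block, replace=False)
        cand = delta.copy().ravel()
        cand[idx] *= -1.0                                 # flip a block of signs
        cand = cand.reshape(x.shape)
        loss = targeted_margin_loss(model(np.clip(x + cand, 0.0, 1.0)), target)
        if loss < best:                                   # keep only improvements
            best, delta = loss, cand
    return np.clip(x + delta, 0.0, 1.0), best
```

For contrast, a model-based alternative can hand the same loss to the open-source Py-BOBYQA solver of Cartis et al., searching the full box $[-\epsilon, \epsilon]^d$ rather than only its vertices. This sketch reuses `targeted_margin_loss` from above and is only viable for low-dimensional inputs, since the solver builds quadratic models from $O(d)$ interpolation points; it is an illustration of the BOBYQA idea, not the paper's attack.

```python
import numpy as np
import pybobyqa  # model-based DFO solver of Cartis, Fiala, Marteau and Roberts

def bobyqa_attack(model, x, target, eps, max_queries):
    """Minimize the targeted attack loss over the box [-eps, eps]^d with
    BOBYQA, which interpolates a quadratic model of the loss and takes
    trust-region steps instead of sampling vertices."""
    d = x.size
    def objfun(delta_flat):
        x_adv = np.clip(x + delta_flat.reshape(x.shape), 0.0, 1.0)
        return targeted_margin_loss(model(x_adv), target)
    soln = pybobyqa.solve(objfun, np.zeros(d),
                          bounds=(-eps * np.ones(d), eps * np.ones(d)),
                          rhobeg=eps / 2,      # keep the initial trust region inside the box
                          maxfun=max_queries)  # query budget
    return np.clip(x + soln.x.reshape(x.shape), 0.0, 1.0), soln.f
```

In practice a BOBYQA-based attack on full-sized images cannot optimize all pixels at once; the perturbation is instead optimized over small sub-blocks of coordinates at a time so that each solve stays low-dimensional, which is what makes the model-based approach competitive under tight query budgets.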
