On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method

Robust machine learning is one of the most prominent topics today: it could help shape a future of advanced AI platforms that perform well not only in the average case but also in worst-case or adversarial situations. Despite this long-term vision, however, existing studies of black-box adversarial attacks remain restricted to narrow threat-model settings (e.g., a single distortion metric and restrictive assumptions about the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push this field further, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Because the threat model is black-box, the proposed ADMM framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), making it applicable in the gradient-free regime. This yields two new black-box adversarial attack methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that the proposed approaches require far fewer function queries than state-of-the-art attack methods while achieving very competitive attack success rates.
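The core idea, replacing the gradient step inside an ADMM loop with a query-based gradient estimate, can be made concrete in a few lines. Below is a minimal Python sketch assuming the standard two-point randomized ZO estimator; all function names (`zo_gradient`, `zo_admm_attack`, `prox_dist`) and hyperparameter values are illustrative assumptions for exposition, not the paper's actual implementation or update rules.

```python
import numpy as np

def zo_gradient(f, x, mu=0.01, n_samples=20, rng=None):
    """Randomized two-point zeroth-order gradient estimate of a
    black-box loss f. Hyperparameters here are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x)
    fx = f(x)  # one query at the current point, reused for all directions
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)           # random probe direction
        grad += (f(x + mu * u) - fx) / mu * u      # forward-difference estimate
    return grad / n_samples

def zo_admm_attack(attack_loss, prox_dist, x0, rho=1.0, steps=50, lr=0.05):
    """ADMM-style splitting for: minimize attack_loss(x0 + delta) + dist(delta).
    A sketch: alternate a ZO step on the black-box attack loss with a
    proximal step on the distortion term, then update the dual variable."""
    delta = np.zeros_like(x0)   # perturbation variable (attack-loss block)
    z = np.zeros_like(x0)       # auxiliary splitting variable (distortion block)
    lam = np.zeros_like(x0)     # scaled dual variable
    for _ in range(steps):
        def aug(d):
            # augmented-Lagrangian objective seen by the black box
            return attack_loss(x0 + d) + 0.5 * rho * np.sum((d - z + lam) ** 2)
        delta = delta - lr * zo_gradient(aug, delta)  # gradient-free minimization step
        z = prox_dist(delta + lam, rho)               # proximal step on the distortion metric
        lam = lam + delta - z                         # dual update
    return z
```

For a squared-L2 distortion, for instance, the proximal step has the closed form `prox_dist = lambda v, rho: rho * v / (rho + 2.0)`, which illustrates why the splitting is attractive: switching the distortion metric only requires swapping `prox_dist`, while the query-consuming ZO step is untouched.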
