On the Design of Black-Box Adversarial Examples by Leveraging Gradient-Free Optimization and Operator Splitting Method

Robust machine learning is one of the most prominent topics today: it could help shape a future of advanced AI platforms that perform well not only in the average case but also in worst-case or adversarial situations. Despite this long-term vision, however, existing studies of black-box adversarial attacks remain restricted to narrow threat-model settings (e.g., a single distortion metric and restrictive assumptions about the target model's feedback to queries) and/or suffer from prohibitively high query complexity. To push this field further, we introduce a general framework based on an operator splitting method, the alternating direction method of multipliers (ADMM), to devise efficient, robust black-box attacks that work with various distortion metrics and feedback settings without incurring high query complexity. Because the threat model is black-box, the proposed ADMM framework is integrated with zeroth-order (ZO) optimization and Bayesian optimization (BO), making it applicable in the gradient-free regime. This yields two new black-box adversarial attack methods, ZO-ADMM and BO-ADMM. Our empirical evaluations on image classification datasets show that the proposed approaches require far fewer function queries than state-of-the-art attack methods while achieving very competitive attack success rates.
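The core idea, replacing the gradient step inside an ADMM loop with a query-based gradient estimate, can be made concrete in a few lines. Below is a minimal Python sketch assuming the standard two-point randomized ZO estimator; all function names (`zo_gradient`, `zo_admm_attack`, `prox_dist`) and hyperparameter values are illustrative assumptions for exposition, not the paper's actual implementation or update rules.

```python
import numpy as np

def zo_gradient(f, x, mu=0.01, n_samples=20, rng=None):
    """Randomized two-point zeroth-order gradient estimate of a
    black-box loss f. Hyperparameters here are illustrative."""
    rng = np.random.default_rng() if rng is None else rng
    grad = np.zeros_like(x)
    fx = f(x)  # one query at the current point, reused for all directions
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)           # random probe direction
        grad += (f(x + mu * u) - fx) / mu * u      # forward-difference estimate
    return grad / n_samples

def zo_admm_attack(attack_loss, prox_dist, x0, rho=1.0, steps=50, lr=0.05):
    """ADMM-style splitting for: minimize attack_loss(x0 + delta) + dist(delta).
    A sketch: alternate a ZO step on the black-box attack loss with a
    proximal step on the distortion term, then update the dual variable."""
    delta = np.zeros_like(x0)   # perturbation variable (attack-loss block)
    z = np.zeros_like(x0)       # auxiliary splitting variable (distortion block)
    lam = np.zeros_like(x0)     # scaled dual variable
    for _ in range(steps):
        def aug(d):
            # augmented-Lagrangian objective seen by the black box
            return attack_loss(x0 + d) + 0.5 * rho * np.sum((d - z + lam) ** 2)
        delta = delta - lr * zo_gradient(aug, delta)  # gradient-free minimization step
        z = prox_dist(delta + lam, rho)               # proximal step on the distortion metric
        lam = lam + delta - z                         # dual update
    return z
```

For a squared-L2 distortion, for instance, the proximal step has the closed form `prox_dist = lambda v, rho: rho * v / (rho + 2.0)`, which illustrates why the splitting is attractive: switching the distortion metric only requires swapping `prox_dist`, while the query-consuming ZO step is untouched.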
