Towards Efficient Data Free Black-box Adversarial Attack

Classic black-box adversarial attacks exploit transferable adversarial examples generated by a similar substitute model to fool the target model. However, these substitute models need to be trained on the target model's training data, which is hard to acquire due to privacy or transmission constraints. Recognizing the limited availability of real data for adversarial queries, recent works proposed training substitute models in a data-free black-box scenario. However, their generative adversarial network (GAN) based frameworks suffer from convergence failure and mode collapse, resulting in low efficiency. In this paper, by rethinking the collaborative relationship between the generator and the substitute model, we design a novel black-box attack framework. The proposed method can efficiently imitate the target model through a small number of queries and achieve a high attack success rate. Comprehensive experiments on six datasets demonstrate the effectiveness of our method against state-of-the-art attacks. In particular, we conduct both label-only and probability-only attacks on the Microsoft Azure online model and achieve a 100% attack success rate with only 0.46% of the query budget of the SOTA method [49].
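To make the setting concrete, the sketch below shows how data-free substitute training is commonly organized (in the spirit of DaST [16] and Data-Free Model Extraction [12]); it is not the exact framework proposed in this paper, and the names `train_substitute` and `query_target` are illustrative placeholders rather than part of any released code.

```python
# A minimal sketch (not this paper's exact method) of data-free substitute
# training in the spirit of DaST [16] / Data-Free Model Extraction [12].
# `generator` and `substitute` are ordinary torch.nn.Module models supplied
# by the caller; `query_target` is a hypothetical callable wrapping the
# black-box API and returning output probabilities for a batch of images.
import torch
import torch.nn.functional as F


def train_substitute(generator, substitute, query_target,
                     steps=1000, batch_size=64, z_dim=100, device="cpu"):
    g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
    s_opt = torch.optim.Adam(substitute.parameters(), lr=1e-3)

    for _ in range(steps):
        z = torch.randn(batch_size, z_dim, device=device)

        # 1) Query the black box once per batch and distill its outputs
        #    into the substitute -- no real training data is needed.
        with torch.no_grad():
            x = generator(z)
            t_prob = query_target(x)  # consumes the query budget
        s_loss = F.kl_div(F.log_softmax(substitute(x), dim=1), t_prob,
                          reduction="batchmean")
        s_opt.zero_grad()
        s_loss.backward()
        s_opt.step()

        # 2) Update the generator to produce samples on which the updated
        #    substitute still disagrees with the queried probabilities, so
        #    the next round of queries stays informative. The target is
        #    non-differentiable, so only the cached t_prob is reused here.
        x = generator(z)  # same z; the generator has not changed yet
        g_loss = -F.kl_div(F.log_softmax(substitute(x), dim=1), t_prob,
                           reduction="batchmean")
        g_opt.zero_grad()
        g_loss.backward()
        g_opt.step()

    return substitute
```

Once the substitute imitates the target sufficiently well, a standard white-box attack such as PGD [32] is run on the substitute and the resulting adversarial examples are transferred to the black-box target; the contribution of this paper lies in how the generator and substitute cooperate so that far fewer queries are needed for that imitation.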

[1] Bo Li, et al. Towards Practical Certifiable Patch Defense with Vision Transformer, 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2] Wenqiang Zhang, et al. Efficient Universal Shuffle Attack for Visual Object Tracking, 2022, ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[3] Chuhan Wu, et al. FedCTR: Federated Native Ad CTR Prediction with Cross-platform User Behavior Data, 2022, ACM Trans. Intell. Syst. Technol.

[4] Yang Cong, et al. Where and How to Transfer: Knowledge Aggregation-Induced Transferability Perception for Unsupervised Domain Adaptation, 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[5] Shouhong Ding, et al. Highly Efficient Natural Image Matting, 2021, BMVC.

[6] Jilin Li, et al. Detecting Adversarial Patch Attacks through Global-local Consistency, 2021, AdvM @ ACM Multimedia.

[7] Lingjuan Lyu, et al. A Novel Attribute Reconstruction Attack in Federated Learning, 2021, ArXiv.

[8] Mofei Song, et al. Disentangled High Quality Salient Object Detection, 2021, 2021 IEEE/CVF International Conference on Computer Vision (ICCV).

[9] Weisi Lin, et al. CMUA-Watermark: A Cross-Model Universal Adversarial Watermark for Combating Deepfakes, 2021, AAAI.

[10] Feiyue Huang, et al. Delving into Data: Effectively Substitute Training for Black-box Attack, 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11] Philip S. Yu, et al. Privacy and Robustness in Federated Learning: Attacks and Defenses, 2020, IEEE Transactions on Neural Networks and Learning Systems.

[12] Nicolas Papernot, et al. Data-Free Model Extraction, 2020, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Lingjuan Lyu, et al. How to Democratise and Protect AI: Fair and Differentially Private Decentralised Deep Learning, 2020, IEEE Transactions on Dependable and Secure Computing.

[14] Jihwan P. Choi, et al. Data-Free Network Quantization With Adversarial Knowledge Distillation, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[15] Xiaowei Xu, et al. What Can Be Transferred: Unsupervised Domain Adaptation for Endoscopic Lesions Segmentation, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16] Yipeng Liu, et al. DaST: Data-Free Substitute Training for Adversarial Attacks, 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17] Xinchao Wang, et al. Data-Free Adversarial Distillation, 2019, ArXiv.

[18] Tong Zhang, et al. Black-Box Adversarial Attack with Transferable Model-based Embedding, 2019, ICLR.

[19] Zhengxing Sun, et al. Co-saliency Detection Based on Hierarchical Consistency, 2019, ACM Multimedia.

[20] Bo Li, et al. Detecting Robust Co-Saliency with Recurrent Co-Attention Neural Network, 2019, IJCAI.

[21] Zhengxing Sun, et al. SuperVAE: Superpixelwise Variational Autoencoder for Salient Object Detection, 2019, AAAI.

[22] Jun Zhu, et al. Improving Black-box Adversarial Attacks with a Transfer-based Prior, 2019, NeurIPS.

[23] Amos Storkey, et al. Zero-shot Knowledge Transfer via Adversarial Belief Matching, 2019, NeurIPS.

[24] Michael I. Jordan, et al. HopSkipJumpAttack: A Query-Efficient Decision-Based Attack, 2019, 2020 IEEE Symposium on Security and Privacy (SP).

[25] Yahong Han, et al. Curls & Whey: Boosting Black-Box Adversarial Attacks, 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Timo Aila, et al. A Style-Based Generator Architecture for Generative Adversarial Networks, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Tribhuvanesh Orekondy, et al. Knockoff Nets: Stealing Functionality of Black-Box Models, 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28] Jinfeng Yi, et al. Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach, 2018, ICLR.

[29] Matthias Bethge, et al. Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models, 2017, ICLR.

[30] Roland Vollgraf, et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms, 2017, ArXiv.

[31] Jinfeng Yi, et al. ZOO: Zeroth Order Optimization Based Black-box Attacks to Deep Neural Networks without Training Substitute Models, 2017, AISec@CCS.

[32] Aleksander Madry, et al. Towards Deep Learning Models Resistant to Adversarial Attacks, 2017, ICLR.

[33] Samy Bengio, et al. Adversarial Machine Learning at Scale, 2016, ICLR.

[34] David A. Wagner, et al. Towards Evaluating the Robustness of Neural Networks, 2016, 2017 IEEE Symposium on Security and Privacy (SP).

[35] Samy Bengio, et al. Adversarial examples in the physical world, 2016, ICLR.

[36] Ananthram Swami, et al. Practical Black-Box Attacks against Deep Learning Systems using Adversarial Examples, 2016, ArXiv.

[37] Jian Sun, et al. Deep Residual Learning for Image Recognition, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Sergey Ioffe, et al. Rethinking the Inception Architecture for Computer Vision, 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[39] Geoffrey E. Hinton, et al. Distilling the Knowledge in a Neural Network, 2015, ArXiv.

[40] Jonathon Shlens, et al. Explaining and Harnessing Adversarial Examples, 2014, ICLR.

[41] Joan Bruna, et al. Intriguing properties of neural networks, 2013, ICLR.

[42] Ya Le, et al. Tiny ImageNet Visual Recognition Challenge, 2015.

[43] Andrew Y. Ng, et al. Reading Digits in Natural Images with Unsupervised Feature Learning, 2011.

[44] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images, 2009.

[45] Yoshua Bengio, et al. Gradient-based learning applied to document recognition, 1998, Proc. IEEE.

[46] Bo Li, et al. Group-Wise Deep Object Co-Segmentation With Co-Attention Recurrent Neural Network, 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[47] Chuhan Wu, et al. Communication-efficient federated learning via knowledge distillation, 2021, Nature Communications.