TAG: Transformer Attack from Gradient

Although federated learning has increasingly gained attention in terms of effectively utilizing local devices for data privacy enhancement, recent studies show that publicly shared gradients in the training process can reveal the private training images (gradient leakage) to a third-party in computer vision. We have, however, no systematic understanding of the gradient leakage mechanism on the Transformer based language models. In this paper, as the first attempt, we formulate the gradient attack problem on the Transformer-based language models and propose a gradient attack algorithm, TAG, to reconstruct the local training data. We develop a set of metrics to evaluate the effectiveness of the proposed attack algorithm quantitatively. Experimental results on Transformer, TinyBERT4, TinyBERT6, BERTBASE , and BERTLARGE using GLUE benchmark show that TAG works well on more weight distributions in reconstructing training data and achieves 1.5× recover rate and 2.5× ROUGE-2 over prior methods without the need of ground truth label. TAG can obtain up to 90% data by attacking gradients in CoLA dataset. In addition, TAG has a stronger adversary on large models, small dictionary size, and small input length. We hope the proposed TAG will shed some light on the privacy leakage problem in Transformer-based NLP models.

[1]  Michael Moeller,et al.  Inverting Gradients - How easy is it to break privacy in federated learning? , 2020, NeurIPS.

[2]  Alexander J. Smola,et al.  Scaling Distributed Machine Learning with the Parameter Server , 2014, OSDI.

[3]  Omer Levy,et al.  GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding , 2018, BlackboxNLP@EMNLP.

[4]  Si Chen,et al.  Improved Techniques for Model Inversion Attacks , 2021, ArXiv.

[5]  Jianfeng Gao,et al.  UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training , 2020, ICML.

[6]  Qun Liu,et al.  TinyBERT: Distilling BERT for Natural Language Understanding , 2020, EMNLP.

[7]  Alexander J. Smola,et al.  Communication Efficient Distributed Machine Learning with the Parameter Server , 2014, NIPS.

[8]  Tassilo Klein,et al.  Differentially Private Federated Learning: A Client Level Perspective , 2017, ArXiv.

[9]  Moran Baruch,et al.  A Little Is Enough: Circumventing Defenses For Distributed Learning , 2019, NeurIPS.

[10]  Kevin Gimpel,et al.  Gaussian Error Linear Units (GELUs) , 2016 .

[11]  Vitaly Shmatikov,et al.  Exploiting Unintended Feature Leakage in Collaborative Learning , 2018, 2019 IEEE Symposium on Security and Privacy (SP).

[12]  Hubert Eichner,et al.  Federated Learning for Mobile Keyboard Prediction , 2018, ArXiv.

[13]  Blaise Agüera y Arcas,et al.  Communication-Efficient Learning of Deep Networks from Decentralized Data , 2016, AISTATS.

[14]  Yiming Yang,et al.  On the Sentence Embeddings from BERT for Semantic Textual Similarity , 2020, EMNLP.

[15]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[16]  Yiming Yang,et al.  XLNet: Generalized Autoregressive Pretraining for Language Understanding , 2019, NeurIPS.

[17]  Mark Chen,et al.  Language Models are Few-Shot Learners , 2020, NeurIPS.

[18]  Ido Dagan,et al.  The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.

[19]  Thomas Wolf,et al.  DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter , 2019, ArXiv.

[20]  Jeffrey Dean,et al.  Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[21]  Wanxiang Che,et al.  Pre-Training with Whole Word Masking for Chinese BERT , 2019, ArXiv.

[22]  Samuel R. Bowman,et al.  Neural Network Acceptability Judgments , 2018, Transactions of the Association for Computational Linguistics.

[23]  Lukasz Kaiser,et al.  Attention is All you Need , 2017, NIPS.

[24]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[25]  Andrey Kutuzov,et al.  Vec2graph: A Python Library for Visualizing Word Embeddings as Graphs , 2019, AIST.

[26]  H. Brendan McMahan,et al.  Learning Differentially Private Recurrent Language Models , 2017, ICLR.

[27]  Panos M. Pardalos,et al.  Analysis of Images, Social Networks and Texts , 2014, Communications in Computer and Information Science.

[28]  Giuseppe Ateniese,et al.  Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning , 2017, CCS.

[29]  Ian S. Dunn,et al.  Exploring the Limits , 2009 .

[30]  Colin Raffel,et al.  Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer , 2019, J. Mach. Learn. Res..

[31]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[32]  Song Han,et al.  Deep Leakage from Gradients , 2019, NeurIPS.

[33]  H. Brendan McMahan,et al.  Training Production Language Models without Memorizing User Data , 2020, ArXiv.

[34]  Hubert Eichner,et al.  Towards Federated Learning at Scale: System Design , 2019, SysML.

[35]  Dan Guo,et al.  SAPAG: A Self-Adaptive Privacy Attack From Gradients , 2020, ArXiv.

[36]  Marc'Aurelio Ranzato,et al.  Large Scale Distributed Deep Networks , 2012, NIPS.

[37]  Yoshua Bengio,et al.  Generative Adversarial Nets , 2014, NIPS.

[38]  Pradeep Dubey,et al.  Distributed Deep Learning Using Synchronous Stochastic Gradient Descent , 2016, ArXiv.

[39]  Christopher Potts,et al.  Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank , 2013, EMNLP.

[40]  Sarvar Patel,et al.  Practical Secure Aggregation for Privacy-Preserving Machine Learning , 2017, IACR Cryptol. ePrint Arch..

[41]  Richard Nock,et al.  Advances and Open Problems in Federated Learning , 2021, Found. Trends Mach. Learn..

[42]  Somesh Jha,et al.  Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures , 2015, CCS.

[43]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.