Benchmarking Differential Privacy and Federated Learning for BERT Models

Natural Language Processing (NLP) techniques can help diagnose medical conditions such as depression from a collection of a person's utterances. Depression is a serious medical illness that adversely affects how a person feels, thinks, and acts, and it can lead to emotional and physical problems. Because such data is sensitive, privacy safeguards are needed both when handling it and when training models on it. In this work, we study the effects of applying Differential Privacy (DP), in both a centralized and a Federated Learning (FL) setup, on training contextualized language models (BERT, ALBERT, RoBERTa, and DistilBERT). We offer insights into how to train NLP models privately and into which architectures and setups provide more desirable privacy-utility trade-offs. We envisage this work being used in future healthcare and mental health studies to keep medical history private, and to that end we provide an open-source implementation.
