SignDS-FL: Local Differentially Private Federated Learning with Sign-based Dimension Selection

Federated Learning (FL) [31] is a decentralized learning paradigm that has attracted increasing attention for its computational efficiency and privacy preservation. However, recent research shows that the original FL framework may still leak sensitive information about clients' local data through the exchanged local updates and the global model parameters. Local Differential Privacy (LDP), a rigorous privacy notion, has therefore been applied to Federated Learning to provide formal privacy guarantees and prevent such leakage. Previous LDP-FL solutions, however, suffer from considerable utility loss as model dimensionality grows. Recent work [29] proposed a two-stage framework that mitigates this dimension-dependency problem by first selecting one "important" dimension for each local update and then perturbing its value to construct a sparse privatized update. Nevertheless, this framework can still lose utility because the per-stage privacy budget is insufficient and model convergence is slow. In this article, we propose an improved framework, SignDS-FL, which shares the dimension-selection idea of Reference [29] but saves the privacy cost of the value-perturbation stage by assigning random sign values to the selected dimensions. Beyond the single-dimension selection algorithms of Reference [29], we propose an Exponential Mechanism-based Multi-Dimension Selection algorithm that further improves model convergence and accuracy. We evaluate the framework on a number of real-world datasets with both simple logistic regression models and deep neural networks. For training logistic regression models on structured datasets, our framework incurs only a \( \sim \)1%–2% accuracy loss, compared to a \( \sim \)5%–15% decrease for the baseline methods. For training deep neural networks on image datasets, the accuracy loss of our framework is less than \( 8\% \) and at best only \( 2\% \). Extensive experimental results show that our framework significantly outperforms previous LDP-FL solutions and achieves a better utility-privacy trade-off.
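
To make the core idea more concrete, the following is a minimal sketch (not the paper's actual algorithm) of how a client-side report in the spirit of SignDS-FL could look: the entire privacy budget is spent on an Exponential Mechanism-based dimension selection, and only a randomly drawn sign value is reported for the selected dimension, so no separate value-perturbation budget is needed. The function name, the top-k scoring rule, the binary utility function, and the single-dimension output are illustrative assumptions; the multi-dimension variant described above would sample a set of indices instead.

```python
import numpy as np


def signds_local_report(grad, eps, k=10, rng=None):
    """Illustrative sketch of a SignDS-style local report (not the paper's exact algorithm).

    grad: the client's flattened local update; eps: the full local privacy budget,
    spent entirely on dimension selection; k: size of the "important" dimension set
    (an assumed hyper-parameter). Returns one selected index and the sign assigned to it.
    """
    if rng is None:
        rng = np.random.default_rng()
    d = grad.size

    # 1. Draw a random sign; only this sign value is ever reported.
    s = rng.choice([-1, 1])

    # 2. "Important" set: top-k coordinates (by magnitude) whose sign agrees with s.
    order = np.argsort(-np.abs(grad))
    important = np.zeros(d, dtype=bool)
    for j in order[:k]:
        if np.sign(grad[j]) == s:
            important[j] = True

    # 3. Exponential Mechanism over all indices with a binary utility
    #    (1 if the index is important, 0 otherwise; sensitivity 1),
    #    i.e., important indices are exp(eps / 2) times more likely to be chosen.
    weights = np.where(important, np.exp(eps / 2.0), 1.0)
    j = int(rng.choice(d, p=weights / weights.sum()))

    # 4. Report only the index and its assigned sign -- no value-perturbation stage.
    return j, s
```

A client would call, e.g., `signds_local_report(local_gradient, eps=1.0)` and send only the (index, sign) pair; on the server side, such sparse sign reports from many clients would typically be summed and scaled by a learning rate, as in sign-based SGD, to form the global update.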

[1] Jens Grossklags et al. Comprehensive Analysis of Privacy Leakage in Vertical Federated Learning During Prediction, 2022, Proc. Priv. Enhancing Technol.

[2] Jens Grossklags et al. Privacy-Preserving High-dimensional Data Collection with Federated Generative Autoencoder, 2021, Proc. Priv. Enhancing Technol.

[3] Leandros Tassiulas et al. Model Pruning Enables Efficient Federated Learning on Edge Devices, 2019, IEEE Transactions on Neural Networks and Learning Systems.

[4] S. Harmon et al. Federated semi-supervised learning for COVID region segmentation in chest CT using multi-national data from China, Italy, Japan, 2020, Medical Image Analysis.

[5] Xin Liu et al. Adaptive Federated Dropout: Improving Communication Efficiency and Generalization for Federated Learning, 2020, IEEE INFOCOM 2021 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS).

[6] Jun Zhao et al. Local Differential Privacy-Based Federated Learning for Internet of Things, 2020, IEEE Internet of Things Journal.

[7] Richard Nock et al. Advances and Open Problems in Federated Learning, 2019, Found. Trends Mach. Learn.

[8] Wenqi Wei et al. Secure and Utility-Aware Data Collection with Condensed Local Differential Privacy, 2019, IEEE Transactions on Dependable and Secure Computing.

[9] Wenqi Wei et al. LDP-Fed: Federated Learning with Local Differential Privacy, 2020, EdgeSys@EuroSys.

[10] Masatoshi Yoshikawa et al. FedSel: Federated SGD under Local Differential Privacy with Top-k Dimension Selection, 2020, DASFAA.

[11] Tianjian Chen et al. FedVision: An Online Visual Object Detection Platform Powered by Federated Learning, 2020, AAAI.

[12] H. B. McMahan et al. Generative Models for Effective ML on Private, Decentralized Datasets, 2019, ICLR.

[13] Boi Faltings et al. Federated Generative Privacy, 2019, IEEE Intelligent Systems.

[14] Wei Yang Bryan Lim et al. Federated Learning in Mobile Edge Networks: A Comprehensive Survey, 2019, IEEE Communications Surveys & Tutorials.

[15] Anit Kumar Sahu et al. Federated Learning: Challenges, Methods, and Future Directions, 2019, IEEE Signal Processing Magazine.

[16] Yang Qiang et al. Federated Recommendation Systems, 2019, 2019 IEEE International Conference on Big Data (Big Data).

[17] Ilana Segall et al. Federated Learning for Ranking Browser History Suggestions, 2019, ArXiv.

[18] Daguang Xu et al. Privacy-preserving Federated Brain Tumour Segmentation, 2019, MLMI@MICCAI.

[19] Song Han et al. Deep Leakage from Gradients, 2019, NeurIPS.

[20] Ge Yu et al. Collecting and Analyzing Multidimensional Data with Local Differential Privacy, 2019, 2019 IEEE 35th International Conference on Data Engineering (ICDE).

[21] Rui Zhang et al. A Hybrid Approach to Privacy-Preserving Federated Learning, 2018, Informatik Spektrum.

[22] Amir Houmansadr et al. Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning, 2018, 2019 IEEE Symposium on Security and Privacy (SP).

[23] Hubert Eichner et al. Applied Federated Learning: Improving Google Keyboard Query Suggestions, 2018, ArXiv.

[24] Lingxiao Wang et al. Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization, 2018, NeurIPS.

[25] Gaurav Kapoor et al. Protection Against Reconstruction and Its Applications in Private Federated Learning, 2018, ArXiv.

[26] Sebastian Caldas et al. Expanding the Reach of Federated Learning by Reducing Client Resource Requirements, 2018, ArXiv.

[27] Liusheng Huang et al. PrivSet: Set-Valued Data Analyses with Locale Differential Privacy, 2018, IEEE INFOCOM 2018 - IEEE Conference on Computer Communications.

[28] Kamyar Azizzadenesheli et al. signSGD: compressed optimisation for non-convex problems, 2018, ICML.

[29] H. Brendan McMahan et al. Learning Differentially Private Recurrent Language Models, 2017, ICLR.

[30] Cecilia M. Procopiuc et al. PrivBayes, 2017.

[31] Roland Vollgraf et al. Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms, 2017, ArXiv.

[32] Gregory Cohen et al. EMNIST: an extension of MNIST to handwritten letters, 2017, CVPR 2017.

[33] Martín Abadi et al. Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data, 2016, ICLR.

[34] Blaise Agüera y Arcas et al. Communication-Efficient Learning of Deep Networks from Decentralized Data, 2016, AISTATS.

[35] Ian Goodfellow et al. Deep Learning with Differential Privacy, 2016, CCS.

[36] Yin Yang et al. Collecting and Analyzing Data from Smart Device Users with Local Differential Privacy, 2016, ArXiv.

[37] Martin J. Wainwright et al. Minimax Optimal Procedures for Locally Private Estimation, 2016, ArXiv.

[38] Vitaly Shmatikov et al. Privacy-preserving deep learning, 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[39] Aaron Roth et al. The Algorithmic Foundations of Differential Privacy, 2014, Found. Trends Theor. Comput. Sci.

[40] Jun Zhang et al. PrivBayes: Private Data Release via Bayesian Networks, 2014, SIGMOD Conference.

[41] Davide Anguita et al. A Public Domain Dataset for Human Activity Recognition using Smartphones, 2013, ESANN.

[42] Moni Naor et al. On the Complexity of Differentially Private Data Release: Efficient Algorithms and Hardness Results, 2009, STOC '09.

[43] Kamalika Chaudhuri et al. Privacy-preserving Logistic Regression, 2008, NIPS.

[44] Sofya Raskhodnikova et al. What Can We Learn Privately?, 2008, 2008 49th Annual IEEE Symposium on Foundations of Computer Science.

[45] Kunal Talwar et al. Mechanism Design via Differential Privacy, 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[46] Ivan Damgård et al. A Generalisation, a Simplification and Some Applications of Paillier's Probabilistic Public-Key System, 2001, Public Key Cryptography.

[47] I. Damgård et al. A Generalisation, a Simplification and Some Applications of Paillier's Probabilistic Public-Key System, 2000.

[48] John C. Platt. Fast Training of Support Vector Machines Using Sequential Minimal Optimization, Advances in Kernel Methods, 1999.

[49] Ron Kohavi et al. Scaling Up the Accuracy of Naive-Bayes Classifiers: A Decision-Tree Hybrid, 1996, KDD.

[50] Jonathan J. Hull. A Database for Handwritten Text Recognition Research, 1994, IEEE Trans. Pattern Anal. Mach. Intell.

[51] Lawrence D. Jackel et al. Backpropagation Applied to Handwritten Zip Code Recognition, 1989, Neural Computation.

[52] Andrew Chi-Chih Yao. Protocols for Secure Computations (Extended Abstract), 1982, FOCS.

[53] S. L. Warner. Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias, 1965, Journal of the American Statistical Association.