Compressed Private Aggregation for Scalable and Robust Federated Learning over Massive Networks

Federated learning (FL) is an emerging paradigm that allows a central server to train machine learning models using remote users' data. Despite its growing popularity, FL faces several challenges: preserving the privacy of local datasets, sensitivity to poisoning attacks by malicious users, and communication overhead, with the latter becoming especially dominant in large-scale networks. These limitations are typically mitigated individually by local differential privacy (LDP) mechanisms, robust aggregation, compression, and user-selection techniques, each of which usually comes at the cost of accuracy. In this work, we present compressed private aggregation (CPA), which allows massive deployments to communicate at extremely low bit rates while simultaneously achieving privacy, anonymity, and resilience to malicious users. CPA randomizes a codebook for compressing the data into a few bits using nested lattice quantizers, ensuring anonymity and robustness, and applies a subsequent perturbation to satisfy LDP. The proposed CPA is proven to yield FL convergence at the same asymptotic rate as FL without privacy, compression, and robustness considerations, while satisfying both anonymity and LDP requirements. These analytical properties are empirically confirmed in a numerical study, where we demonstrate the performance gains of CPA over separate mechanisms for compression and privacy when training different image classification models, as well as its robustness in mitigating the harmful effects of malicious users.
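To make the recipe concrete, the sketch below illustrates the general idea in Python: a shared random dither plays the role of the randomized codebook, a one-dimensional modulo (nested) uniform quantizer compresses each model update into a few bits, and a discrete randomized-response step over the codewords provides LDP. This is an illustrative approximation under assumed parameters (step, num_levels, epsilon), not the paper's exact CPA algorithm, which uses multi-dimensional nested lattice quantizers.

```python
# Illustrative sketch (not the exact CPA algorithm): a 1-D "nested lattice"
# quantizer reduces to dithered uniform quantization folded onto a modulo range,
# with the shared random dither acting as the randomized codebook. A k-ary
# randomized-response perturbation of the discrete codewords then provides
# epsilon-LDP. All parameter names and values below are assumptions for the demo.
import numpy as np

rng = np.random.default_rng(0)

def randomized_codebook_quantize(x, step, num_levels, dither):
    """Dithered uniform quantization folded into a nested (modulo) range."""
    q = np.round((x + dither) / step)          # fine-lattice quantization
    return np.mod(q, num_levels).astype(int)   # fold onto the coarse lattice

def ldp_perturb(codeword, num_levels, epsilon):
    """k-ary randomized response over the discrete codewords (epsilon-LDP)."""
    p_keep = np.exp(epsilon) / (np.exp(epsilon) + num_levels - 1)
    keep = rng.random(codeword.shape) < p_keep
    # when flipping, move uniformly to one of the other num_levels - 1 symbols
    offsets = rng.integers(1, num_levels, size=codeword.shape)
    return np.where(keep, codeword, (codeword + offsets) % num_levels)

def server_decode(codeword, step, num_levels, dither):
    """Map received symbols back to real values and remove the shared dither."""
    centered = np.mod(codeword + num_levels // 2, num_levels) - num_levels // 2
    return centered * step - dither

# --- toy round: each user compresses and privatizes its local model update ---
step, num_levels, epsilon = 0.05, 16, 2.0
updates = [rng.normal(scale=0.1, size=8) for _ in range(5)]   # local updates
dither = rng.uniform(-step / 2, step / 2, size=8)             # shared dither

decoded = []
for u in updates:
    cw = randomized_codebook_quantize(u, step, num_levels, dither)
    cw = ldp_perturb(cw, num_levels, epsilon)
    decoded.append(server_decode(cw, step, num_levels, dither))

aggregate = np.mean(decoded, axis=0)   # server aggregates the privatized updates
print(aggregate)
```

In this toy round each user sends only log2(num_levels) = 4 bits per coordinate, and the perturbation noise largely averages out in the server-side mean, which is the intuition behind CPA preserving the FL convergence rate.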
