Gradient Obfuscation Gives a False Sense of Security in Federated Learning

Federated learning has been proposed as a privacy-preserving machine learning framework that enables multiple clients to collaborate without sharing raw data. However, client privacy protection is not guaranteed by design in this framework. Prior work has shown that the gradient sharing strategies in federated learning can be vulnerable to data reconstruction attacks. In practice, however, clients may not transmit raw gradients, whether because of the high communication cost or to meet privacy-enhancement requirements. Empirical studies have demonstrated that gradient obfuscation, including intentional obfuscation via gradient noise injection and unintentional obfuscation via gradient compression, can provide more privacy protection against reconstruction attacks. In this work, we present a new data reconstruction attack framework targeting the image classification task in federated learning. We show that commonly adopted gradient postprocessing procedures, such as gradient quantization, gradient sparsification, and gradient perturbation, may give a false sense of security in federated learning. Contrary to prior studies, we argue that privacy enhancement should not be treated as a byproduct of gradient compression. Additionally, we design a new method under the proposed framework to reconstruct the image at the semantic level. We quantify the semantic privacy leakage and compare it with the conventional leakage measured by image similarity scores. Our comparisons challenge the image data leakage evaluation schemes in the literature. The results emphasize the importance of revisiting and redesigning the privacy protection mechanisms for client data in existing federated learning algorithms.
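To make the three gradient postprocessing procedures named above concrete, the sketch below gives minimal reference implementations of sign-based quantization (in the spirit of signSGD), top-k sparsification, and Gaussian gradient perturbation. The function names and the flat-list gradient representation are our own illustrative assumptions, not the paper's actual attack or defense code.

```python
import random

def quantize_sign(grad):
    """1-bit (sign) quantization: each coordinate is reduced to +/-1."""
    return [1.0 if g >= 0 else -1.0 for g in grad]

def sparsify_topk(grad, k):
    """Top-k sparsification: zero out all but the k largest-magnitude entries."""
    keep = set(sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(grad)]

def perturb_gaussian(grad, sigma, rng):
    """Gradient perturbation: add i.i.d. zero-mean Gaussian noise of scale sigma."""
    return [g + rng.gauss(0.0, sigma) for g in grad]

# Example: one client's (toy) gradient vector passed through each procedure.
grad = [0.5, -1.2, 0.1, 3.0]
quantized = quantize_sign(grad)        # [1.0, -1.0, 1.0, 1.0]
sparse = sparsify_topk(grad, 2)        # [0.0, -1.2, 0.0, 3.0]
noisy = perturb_gaussian(grad, 0.1, random.Random(0))
```

Each procedure discards or distorts gradient information, which is why it is often assumed to protect privacy; the paper's point is that such obfuscated gradients can still be inverted, so these transforms should not be relied on as privacy mechanisms.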
